Abstract


In 1995, the C-17 Factory Simulation Model (FSM) was developed for the C-17
System Program Office at Aeronautical Systems Center (ASC), Wright-Patterson AFB,
Ohio. Designed to enable analysts to address “what-if” questions about the resources
required to build future aircraft, the FSM is based on learning curve models that are used
to both portray and simulate future aircraft production.

In  this  thesis,  we  examine  and  develop  alternate  learning  curve  models  that  also 
utilize  a  small  amount  of  initial  production  data  (about  20  observations)  to  portray  the 
relationship  between  the  number  of  aircraft  built  and  the  amount  of  resources  required  to 
build them. The goal is to identify a model which not only provides a good fit and forecast
based on a small amount of data but is also intuitive and reasonably simple to apply. In
addition  to  examining  variations  on  the  Log-Linear  Learning  Curve  model,  we  propose 
and  evaluate  the  use  of  Box  and  Jenkins  Autoregressive  Moving  Average  (ARMA) 
models  for  modeling  the  effects  of  learning. 

These  models  are  exercised  in  fitting  simulated  log-linear  data,  as  well  as  in  fitting 
and  forecasting  historical  F-102  manufacturing  data  and  notional  C-17  manufacturing 
data.  The  results  are  somewhat  inconclusive  since  they  do  not  identify  any  one  model  as 
the  best  for  all  data  sets.  They  do,  however,  suggest  that  ARMA  models  are  a  very 
promising  alternative  to  the  standard  log-linear  learning  curve. 

The thesis concludes with an examination of the effects of explicitly accounting for
uncertainty in parameter estimation when simulating future performance based on the
traditional log-linear learning curve model. The results show that the approach employed
in  the  FSM  is  viable  even  though  it  does  not  directly  account  for  this  uncertainty. 
AN INVESTIGATION OF LEARNING CURVES

AND 

THEIR  USE  IN  SIMULATION 

1.  Motivation  for  the  Study 


1-1.  Background 

One of the fundamental quantities of interest when estimating the life cycle costs of
weapon systems such as the C-17 is an estimate of the resources required to build a given
number  of  systems.  This  estimate  provides  the  basis  upon  which  many  other  costs  and 
performance  measures  can  be  calculated.  Given  these,  reasonably  accurate  answers  can  be  given 
to  questions  such  as,  “How  will  the  utilization  of  resources  such  as  manpower  or  tooling  be 
affected  by  changes  in  the  ‘buy  profile’  for  the  weapon  system?”  or  “How  will  delivery  dates  be 
affected  by  changes  in  the  assembly  process?”  and  other  “what  if?”  questions  proposed  by 
Congress  and  senior  management  regarding  the  effects  of  various  changes  in  planning  and 
production  strategies.  Models  which  can  help  to  answer  these  kinds  of  questions  are  invaluable 
tools  in  the  estimation  of  life  cycle  costs,  but  their  development  requires  serious  scrutiny  and 
analysis  if  they  are  to  provide  useful  information. 

In  many  production,  assembly,  or  maintenance  operations,  the  amount  of  time  required  to 
complete  a  task  tends  to  decrease  each  time  the  task  is  undertaken.  A  common  approach  in 
modeling  this  phenomenon  is  to  use  a  learning  curve  in  which  the  number  of  hours  required  to 
complete the task is modeled as a function of the number of units completed to date. This long-term
relationship is often estimated on the basis of initial production data using straightforward
regression procedures. The resulting fitted relationship then provides a forecast of the expected
number of hours that will be required for the task to be performed in future operations.

When simulating such production processes, one needs not only to portray the expected
number  of  hours  required  in  future  production,  but  also  how  those  hours  can  be  expected  to  vary 
about their mean. For example, in the C-17 Factory Simulation Model (FSM), which is a large-scale
model of the C-17 assembly process recently developed for the C-17 System Program
Office  (SPO),  this  was  handled  by  treating  the  fitted  regression  equation  (based  on  data  collected 
during  the  assembly  of  the  first  20  aircraft)  as  if  it  provided  a  perfect  forecast  of  the  mean  time 
required  to  complete  a  task  in  the  future.  Once  this  relationship  was  determined,  random  errors 
were  generated  about  the  fitted  curve  to  simulate  how  actual  performance  might  vary  in  the 
future.  The  validity  of  this  procedure  is  somewhat  in  doubt,  however. 

An  important  source  of  doubt  centers  on  how  well  this  strategy  can  be  expected  to  model 
future  performance.  This  is  an  especially  important  question  because  the  underlying  learning 
curves  have  been  estimated  on  the  basis  of  a  limited  amount  of  data.  The  approach  used  in  the 
C-17  FSM  does  not  take  into  account  the  uncertainty  surrounding  the  mathematical  form  of  the 
learning  curve  model  nor  the  values  of  the  parameters  within  that  model.  For  example,  might  the 
relationship be more accurately described by an equation of a different form? Alternatively,
how  (if  at  all)  should  predictions  based  on  the  fitted  relationship  account  for  one’s  uncertainty 
about  the  values  of  the  parameters  which  specify  that  relationship?  Should  the  simulation  of  the 
production  of  future  aircraft  account  for  this  uncertainty?  Further  analysis  of  the  appropriateness 
of  this  strategy  and  an  investigation  into  possible  alternative  strategies  is  clearly  warranted. 

1-2.  Problem  Statement 

An analysis of the current methods used in the C-17 FSM and, potentially, the
development  of  improved  models  and  methods  for  modeling  and  simulating  learning  curve 
relationships  will  enable  analysts  to  provide  better  answers  to  the  kinds  of  questions  which  are 
asked in the acquisition process, and allow for better planning. This, in turn, will help to
minimize  (or  avoid  for  the  most  part!)  future  cost  and  time  overruns  and  allow  for  a  more 
efficient  allocation  of  resources. 

1-3.  Approach 

In  this  thesis,  we  provide  an  analysis  and  assessment  of  some  of  the  specific  methods 
used  to  model  and  simulate  learning  curve  relationships  as  implemented  in  the  C-17  FSM.  We 
begin  with  a  thorough  literature  review  to  discover  what  is  known  about  learning  curves  and  to 
assess  what  new  thinking  must  be  done.  The  review  covers  a  variety  of  learning  curve  models 
and  the  use  of  learning  curves  to  simulate  future  performance.  We  next  assess  the  adequacy  of 
the  approach  used  in  the  C-17  FSM  and  evaluate  other  possible  approaches  in  an  effort  to 
determine  what  learning  curve  models  might  be  most  appropriate  for  this  application. 

We also propose a new approach to modeling and simulating learning using autoregressive
[AR(p)]  models  and  AutoRegressive  Moving  Average  [ARMA(p,q)]  models.  Since  we  generally 
do  not  expect  to  have  much  data  upon  which  to  fit  such  models,  it  is  necessary  to  keep  the 
number  of  parameters  (p  and  q)  small;  hence,  we  focus  on  AR(1)  and  ARMA(1,1)  models. 

We  then  attempt  to  make  a  formal  comparison  of  the  models  examined,  comparing  them 
to  each  other,  to  data  similar  to  that  used  in  developing  the  C-17  FSM,  and  to  historical  data  from 
the  F-102  manufacturing  program.  These  comparisons  enable  us  to  draw  some  basic  conclusions 
and  to  make  recommendations  for  future  studies  in  this  area.  We  conclude  with  an  examination 
of  the  effects  of  explicitly  accounting  for  uncertainty  in  parameter  estimation  in  the  traditional 
log-linear  learning  curve  model. 
2. Previous Work in Learning Curves


2-1.  Definition  of  Learning  Curve 

A  learning  curve,  also  known  as  a  progress,  improvement,  or  experience  curve,  is 
a  graphical  or  mathematical  representation  of  how  the  requirement  for  resources  is 
reduced  as  the  production  of  a  product  or  service  is  repeated.  The  learning  curve  concept 
may  be  used  to  predict  production  costs  from  the  known  costs  of  producing  a  product  or 
service,  the  future  service  time  from  a  history  of  service  times,  or  the  time  required  to 
build the nth aircraft from a history of times spent building previous copies of the aircraft.
The  term  “learning”  is  used  in  this  thesis  rather  than  “progress,”  “improvement,”  or 
“experience”  because  of  its  common  use  in  the  United  States  Air  Force  and  most  of  the 
literature.  “Learning,”  as  used  here,  includes  worker  learning,  management  innovations, 
engineering  changes,  and  work  simplification  (Orsini,  1970:pgs  2:3). 

The  aircraft  industry  was  the  first  to  recognize  the  predictive  value  of  learning 
curves.  The  earliest  known  work  on  the  learning  curve  phenomenon  was  done  by  T.P. 
Wright  who  stated  in  his  1936  article,  “Factors  Affecting  the  Cost  of  Airplanes,”  that  he 
“started  his  studies  of  the  variation  of  cost  with  quantity  in  1922.”  (Wright,  1936:122). 

He hypothesized that the cumulative average labor cost for any quantity of airplanes
produced decreases by a constant percentage each time the quantity of airplanes is doubled. To
calculate the cumulative average cost, we utilize equation (2-1).

$\bar{Y}_N = \frac{1}{N} \sum_{X=1}^{N} Y_X$  (2-1)

where

N is the number of units produced to date,

X is the unit number, and
Y_X is the number of hours (cost) for unit X.

As  an  example  of  an  80%  learning  curve,  if  it  takes  100,000  hours  of  labor  to  produce 
Ship  #1,  it  would  take  an  average  of  80,000  hours  to  produce  Ships  #1  and  #2  (or  60,000 
hours  for  Ship  #2  alone),  an  average  of  64,000  hours  for  the  first  four  ships,  51,200  hours 
for  the  first  eight  ships,  and  so  on. 

In the example above, it is important to stress that the numbers 100,000,
80,000, 64,000, etc., are cumulative averages; they are the average costs of producing
the first, first two, and first four aircraft, respectively. The number describing the cost of
Ship  #2  alone  (60,000)  is  known  as  its  marginal  or  unit  cost;  it  is  the  cost  of  producing 
the  second  aircraft  alone.  Most  of  the  data  considered  in  this  thesis  is  marginal  or  unit 
cost  data. 
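
The distinction between cumulative average cost and unit cost in the ship example can be checked numerically. The following Python sketch assumes that the cumulative average follows an 80% curve exactly; the function names are illustrative and do not come from the thesis or the FSM.

```python
import math

def cumulative_average(x, first_unit=100_000.0, rate=0.80):
    """Cumulative average hours for the first x units when the average
    falls to `rate` times its previous value at each doubling of x."""
    b = math.log(rate) / math.log(2)  # exponent implied by the learning rate
    return first_unit * x ** b

def unit_cost(x, first_unit=100_000.0, rate=0.80):
    """Hours for unit x alone: total for x units minus total for x - 1 units."""
    total_x = x * cumulative_average(x, first_unit, rate)
    total_prev = (x - 1) * cumulative_average(x - 1, first_unit, rate) if x > 1 else 0.0
    return total_x - total_prev

# Print unit number, cumulative average, and unit (marginal) cost.
for x in (1, 2, 4, 8):
    print(x, round(cumulative_average(x)), round(unit_cost(x)))
```

For Ships #1 and #2 the cumulative average is 80,000 hours while the unit cost of Ship #2 alone is 60,000 hours, which is the difference between the two kinds of data described above.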

Who  or  what  is  doing  the  learning?  In  organizations,  the  learning  curve  describes 
the  improvement  in  either  individual  productivity  or  organizational  productivity. 
Individual  learning  is  improvement  that  results  when  people  repeat  a  process  and  gain 
skill  or  efficiency  from  their  own  experience.  Organizational  learning  results  from 
practice  as  well,  but  also  comes  from  changes  in  administration,  equipment,  and  product 
design. 

Organizations differ in their performance, and hence in their learning rates, for a number of
reasons. One factor affecting this performance is
the  volume  of  the  output;  all  other  things  being  equal,  the  firm  that  has  the  higher 
cumulative  output  should  have  the  lower  cost.  Another  factor  which  weighs  heavily  is 
the  rate  of  output;  studies  have  shown  that  recent  experience  has  much  more  effect  in 
reducing  cost  than  more  distant  experience.  As  a  result,  if  we  compare  two  companies 
with the same cumulative output, the firm with the higher rate should have a lower cost
curve  because  its  experience  is  more  recent.  In  organizations,  both  kinds  of  learning 
occur  simultaneously  but  are  most  frequently  modeled  with  a  single  learning  curve. 

2-2.  Terminology 

Before  starting  a  discussion  of  the  various  forms  of  learning  curves  (Section  2-3, 
below)  it  is  imperative  that  the  terminology  used  throughout  the  remainder  of  this  work 
be  clearly  defined.  This  is  important  because  the  terminology  is  not  consistent  between 
references. Some of the definitions (those annotated with an asterisk, *) have been taken
directly from a thesis by Orsini (Orsini, 1970:pgs 2:3); the others have been gathered
from  other  sources. 

(1)  direct  man-hours  -  These  are  the  hours  expended  to  manufacture  a  unit  of 
output.  In  the  airframe  industry,  these  hours  consist  of  fabrication,  assembly,  production 
flight,  and  other  production  work  associated  with  the  basic  aircraft  (Orsini,  1970:pgs  2:3). 

(2)  direct  man-hours  for  Unit  One  -  The  total  direct  labor  hours  expended  to 
complete  the  first  operable  unit  (Orsini,  1970:pgs  2:3). 

(3) learning rate - A percent figure which determines the number of direct man-hours
required for each doubled production quantity in relation to the previous doubled
quantity.  For  example,  if  a  learning  curve  has  an  eighty  percent  learning  rate,  the  direct 
man-hours  required  to  produce  unit  2  will  be  eighty  percent  the  number  required  to 
produce  unit  one;  the  direct  man-hours  required  to  produce  unit  four  will  be  eighty 
percent  the  number  required  to  produce  unit  two;  the  direct  man-hours  required  to 
produce  unit  eight  will  be  eighty  percent  the  number  required  to  produce  unit  four;  etc. 

(4) cost - Refers to the quantity of resource required to perform a given activity.
The cost could be in terms of dollars, direct man-hours, or other resource.

(5) cumulative average cost curve - A curve representing the cumulative average
cost  as  the  total  number  of  units  increases. 

(6) unit marginal cost curve - A curve representing the cost of each individual unit as
the total number of units increases.

2-3.  Geometric  Versions  of  the  Learning  Curve 

The  Log-Linear  (Crawford)  Model.  The  most  useful  and  most  widely  used 
learning  curve  model,  the  log-linear  or  constant  percentage  model,  was  first  introduced  in 
1944  (Smith,  1989).  This  model,  which  states  that  the  improvement  in  productivity  is 
fairly  constant  as  output  increases,  is  depicted  in  Figure  2-1. 

Figure 2-1 Example of a Log-Linear Model


The equation for this model is

$Y = aX^{b}$  (2-2)

where


Y  is  the  cumulative  average  amount  of  resources,  or  ‘cost,’  of  producing  the  first 
X  units, 

X  is  the  unit  number, 

a  is  the  ‘cost’  of  producing  the  first  unit,  and 
b  is  a  constant  which  determines  the  learning  rate. 

The  learning  percentage  (learning  rate),  p,  can  be  determined  via 

$p = 2^{b}$.  (2-3)

Equivalently, for a given learning percentage, b can be set by specifying

$b = \ln(p) / \ln(2)$.  (2-4)

In order to estimate the parameters, a and b, of the model given above, a linear
regression using a logarithmic transformation is conducted. The resulting model is

$\ln(Y) = \ln(a) + b \ln(X)$.  (2-5)
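
The estimation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the FSM: it applies ordinary least squares to the log-transformed data, the function name `fit_log_linear` is hypothetical, and the data are generated from an exact 80% curve rather than taken from any real program.

```python
import math

def fit_log_linear(units, hours):
    """Estimate a and b in Y = a * X**b by least squares on
    ln(Y) = ln(a) + b * ln(X); also return the learning rate p = 2**b."""
    log_x = [math.log(x) for x in units]
    log_y = [math.log(y) for y in hours]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx                          # slope of the log-log regression
    a = math.exp(mean_y - b * mean_x)      # intercept, transformed back to hours
    return a, b, 2.0 ** b

# Hypothetical data: 20 units on an exact 80% curve with a = 1000 hours.
true_b = math.log(0.8) / math.log(2)
units = list(range(1, 21))
hours = [1000.0 * x ** true_b for x in units]
a_hat, b_hat, p_hat = fit_log_linear(units, hours)
print(a_hat, b_hat, p_hat)
```

With noise-free data the fit recovers a, b, and p up to rounding error; with real data the early, highly variable observations influence the estimates, as discussed next.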

According to Forsythe et al., there are some distinct disadvantages which arise
from  taking  this  standard  approach.  First  of  all,  the  large  variability  associated  with  the 
early  observations  will  skew  the  estimated  parameters  which  will  then  be  dominated  by 
the  first  few  observations.  Second,  the  per  unit  cost  predicted  by  a  log-linear  learning 
curve  will  converge  to  zero  for  sufficiently  large  volume.  If  convergence  to  zero  occurs 
within  a  realizable  volume,  the  log-linear  model  produces  unrealistic  forecasts. 

Both  the  problem  of  early  observations  skewing  the  estimated  parameters,  and  the 
fact  that  the  log-linear  learning  curve  will  converge  to  zero  for  sufficiently  large  volume 
are  important  considerations  since  they  are  far  more  pronounced  when  applied  to  unit 
costs as opposed to cumulative average costs. This is especially important to us because
the  C-17  FSM  implicitly  applies  the  log-linear  model  to  unit  costs. 

Pegel’s Exponential Function. In an effort to develop more realistic forms of
the learning curve, some authors have proposed models which are based on exponential
functions. One of these authors, Pegel, proposed an algebraic exponential function to
complement the power function (log-linear) model (Smith, 1989). Pegel’s model gives
the marginal cost per unit for the Xth unit as

$MC(X) = \alpha a^{X-1} + \beta$  (2-6)

where

α, a, and β are empirically based parameters.

Using α = 1000, a = 0.8, and β = 100, the curve appears as shown in Figure 2-2.

Figure 2-2 Pegel’s Exponential Function

The curve levels off at a limiting value of 100, which is given by the value of β. This model may be integrated across X to
find the total cost up to the Xth unit in terms of the marginal cost per unit. It may also be
algebraically  manipulated  to  give  the  average  cost  of  the  first  X  units  in  marginal  cost 
index  terms.  The  biggest  advantage  in  Pegel’s  model  is  that  where  the  basic  power 
function  shows  both  the  average  and  marginal  costs  decreasing  with  increasing  output, 
Pegel’s  exponential  function  shows  that  the  marginal  cost  becomes  constant  after  a 
certain  number  of  units  is  produced  (Belkaoui,  1986:pgs  8:9). 
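
The plateau behavior can be illustrated with a short Python sketch. It assumes the exponential form MC(X) = α·a^(X−1) + β, which is how Pegel's function is commonly written, together with the parameter values quoted above (1000, 0.8, and 100); the function name is hypothetical.

```python
def pegel_marginal_cost(x, alpha=1000.0, a=0.8, beta=100.0):
    """Marginal cost of unit x under the assumed form alpha * a**(x - 1) + beta."""
    return alpha * a ** (x - 1) + beta

# The exponential term shrinks with each unit, so the cost settles toward beta.
for x in (1, 5, 10, 20, 50):
    print(x, round(pegel_marginal_cost(x), 1))
```

Unlike the power function, the marginal cost here never falls below β, however many units are produced.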

Forsythe  Cmin  Approach.  A  similar  approach  which  addresses  and  includes  the 
constant nature of marginal cost mentioned above is proposed by Forsythe et al. (Forsythe,
Green, White, and Elmer, 1995: pgs 3:4). They note that it is possible to tailor specific
model  parameters  to  account  for  characteristics  of  the  manufacturing  process.  For 
example,  when  the  absolute  minimum  number  of  hours  required  to  produce  a  unit  is 
known  or  can  be  estimated,  the  basic  log-linear  model  can  be  revised  as  shown  below. 

$Y(X) = aX^{b} + C_{min}$  (2-7)

In  this  equation,  Cmin  is  the  minimum  time  needed  to  produce  a  unit.  As  the  log-linear 
part  of  this  equation  approaches  zero  with  large  values  of  X,  the  time  required  to  build 
the  Xth  unit  approaches  Cmin;  this  makes  intuitive  sense.  This  adjustment,  like  Pegel’s 
Exponential  Function,  prevents  the  time  to  produce  a  unit  from  reaching  the  unrealistic 
value  of  zero  as  the  quantity  produced,  X,  gets  large. 
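
When Cmin is known or can be estimated in advance, one simple way to fit this model is to subtract Cmin from each observation and then apply the usual log-linear regression to the remainder. The Python sketch below takes that route; it is an assumption made for illustration rather than the estimation method of Forsythe et al., and the names `fit_with_floor` and `predict` as well as the data are hypothetical.

```python
import math

def fit_with_floor(units, hours, c_min):
    """Fit Y = a * X**b + c_min for a known floor c_min by regressing
    ln(Y - c_min) on ln(X). Every observation must exceed c_min."""
    log_x = [math.log(x) for x in units]
    log_y = [math.log(y - c_min) for y in hours]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

def predict(x, a, b, c_min):
    """Predicted hours for unit x; approaches c_min as x grows when b < 0."""
    return a * x ** b + c_min

# Hypothetical data: 20 units, 80% learning on the reducible part, floor of 100 hours.
true_b = math.log(0.8) / math.log(2)
units = list(range(1, 21))
hours = [900.0 * x ** true_b + 100.0 for x in units]
a_hat, b_hat = fit_with_floor(units, hours, c_min=100.0)
print(round(a_hat, 3), round(b_hat, 4), round(predict(10_000, a_hat, b_hat, 100.0), 1))
```

The approach breaks down if any observed value is at or below the assumed floor, since the logarithm is then undefined; in that case Cmin has been set too high for the data.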

Levy’s Adaptation Function. Another twist on the marginal cost theme is given
by  Levy’s  Adaptation  Function  (Levy,  1966).  Levy’s  function  is  useful  for  showing  how 
a  firm  can  adapt  itself  to  the  learning  process  and  isolate  the  variables  which  influence 
learning.  His  function  is  given  below. 

MC = [1/β − (1/β − …  (2-8)

where

MC is the marginal cost,

a and b are parameters analogous to the power function parameters,

C is analogous to Cmin, and
β is the production index for the first unit.

The  parameter,  C,  serves  to  flatten  the  curve  for  large  values  of  X.  So,  just  as  Pegel  and 
Forsythe  propose,  the  learning  function  reaches  a  plateau  and  does  not  continue  to 
decrease  (or  increase)  the  way  the  log-linear  function  does  (Belkaoui,  1986:  pg  10). 

The  Stanford-B  Model.  One  of  the  most  well-known  learning  curve  models  is 
the Stanford-B Model. Like the models described above, this model came into existence
because  the  log-linear  model  doesn’t  always  provide  the  best  fit  to  activity/time  data.  The 
Stanford-B  formula,  also  known  as  the  learning  formula  with  the  B-factor  (Summers  and 
Welsch,  1970:  pgs  45:50),  is  given  by  the  following  equation. 

$Y = a(X + B)^{n}$  (2-9)

where 

Y  is  direct  man-hours  required  for  cumulative  unit  number  X, 
a  is  a  constant  which  is  equivalent  to  the  cost  of  the  first  unit  when  B=0, 
n  is  the  exponent  which  describes  the  slope  of  the  asymptote  (-0.5  is  typical), 

B  is  a  constant  which  may  be  expressed  as  the  number  of  units  theoretically 
produced  prior  to  the  first  unit  acceptance.  (B  is  typically  between  0  and  10 
with  4  being  a  common  value). 




Given  the  ‘typical’  values  above,  the  Stanford-B  curve  is  depicted  in  Figure  2-3. 

Figure 2-3 Stanford-B Curve


The  main  feature  of  the  model  is  the  B  factor  which  measures  variations  in  design 
or  other  complexities  which  management  cannot  control  through  engineering  or  retooling. 
Another  way  to  think  of  this  model  is  that  it  accounts  for  previous  learning  by  using  the  B 
factor  as  a  scale  of  displacement. 

To illustrate, Belkaoui (Belkaoui, 1986:pg 11) suggests we note that when n is
set equal to -0.5, the Stanford-B model yields a unit learning curve equation as follows:

$Y = \frac{a}{\sqrt{X + B}}$  (2-10a)

If B is set to 0, this becomes

$Y = aX^{-0.5}$  (2-10b)

Using the equation

$p = 2^{b} = 2^{-0.5} = 0.707$,  (2-10c)

we  can  see  that  this  is  equivalent  to  a  70.7  percent  log-linear  learning  curve.  Smith  notes 
that  the  Stanford-B  isn’t  used  much  in  the  aircraft  industry  anymore  (Smith,  1989). 
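
The effect of the B factor can be seen by comparing the cost ratio between doubled quantities with and without it. The Python sketch below uses the typical values quoted above (n = -0.5, B = 4); the first-unit constant a = 1000 and the function names are assumptions made for illustration.

```python
def stanford_b(x, a=1000.0, b_factor=4.0, n=-0.5):
    """Direct man-hours for unit x under Y = a * (X + B)**n."""
    return a * (x + b_factor) ** n

def doubling_ratio(x, b_factor):
    """Cost of unit 2x relative to unit x for a given B factor."""
    return stanford_b(2 * x, b_factor=b_factor) / stanford_b(x, b_factor=b_factor)

# With B = 0 the ratio is the same at every doubling (the log-linear case).
# With B = 4 early doublings show less improvement, and the ratio moves
# toward the B = 0 value as x grows.
for x in (1, 10, 100, 1000):
    print(x, round(doubling_ratio(x, 0.0), 4), round(doubling_ratio(x, 4.0), 4))
```

This is the sense in which B acts as a displacement: the curve behaves as if B units of experience had already been accumulated before the first unit was accepted.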

DeJong’s Learning Formula with an Incompressibility Factor. Another form
of the log-linear model, which is similar to the models proposed by Pegel, Forsythe, and
Levy (in that the value of MC levels out after a large number of builds) is DeJong’s
Learning Formula with an Incompressibility Factor. This model takes into account the
differing nature of manual and machine labor. DeJong refers to manual activity time as
being  compressible  (subject  to  learning);  he  considers  machine  activity  time  to  be 
incompressible.  His  model  has  the  following  form  which  consists  of  a  variable  part 
representing  manual  activity  and  a  fixed  part  representing  machine  activity. 


$MC = a[M + (1 - M)X^{n}]$  (2-11)

Here M is the fixed part and $(1 - M)X^{n}$ is the variable part,


where 

MC is the marginal time for the Xth unit,

a  and  n  are  parameters  analogous  to  the  power  function  parameters,  and 

M  is  the  factor  of  incompressibility. 

The  value  of  M  varies  between  0  and  1  and  would  be,  approximately,  .25  for  manually 
dominated  activities  and  .50  for  machine-dominated  activities.  Note  that  for  an  activity 
which  is  100%  manual,  M  would  be  zero  and  DeJong’s  formula  reduces  to  the  log-linear 
model.  For  an  activity  which  is  100%  machine  dominated,  M  equals  1  and  MC  is  a 
constant. Since we really don’t expect machines to be able to improve their performance,
this makes intuitive sense. Some learning curve prognosticators (Smith, 1989: pg 7) have
suggested, however, that machines can ‘learn’ through the adjustment of machine
parameters or production methods. Smith suggests that, “The DeJong model may not be
particularly useful in machine intensive industries, especially considering the model
adds  so  much  more  complexity  to  learning  curve  mathematics  and  almost  eliminates  any 
intuitive  understanding  of  real-life  application  to  the  typical  factory  worker  or  cost 
estimator.”  (Smith,  1989) 

The S-Curve. The S-Curve is another curve which has been used to model the
learning  process  seen  in  manufacturing.  An  S-type  function  can  be  represented  by  a 
composite  function  whose  shape  resembles  the  flattened  horizontal  S  shown  below. 


Figure  2-4.  The  S-Curve 


Note  the  flatness  of  the  earliest  part  of  the  curve.  This  can  be  attributed  to  the  fact 
that  experiments  are  being  made  on  the  production  process  in  an  effort  to  find  the  best 
tooling  and  methods.  The  numerous  mid-course  changes  made  at  this  time  preclude  rapid 
improvement  in  the  amount  of  time  to  build  each  unit.  Once  all  the  corrections  are  made 
to the tooling and methods, it is possible to obtain very rapid learning; this is shown in
the center part of the curve. Finally, once most of the learning has been done, the curve
starts  leveling  out  and  approaches  a  minimum  time  required  to  build  an  individual  unit. 

As  mentioned  above,  the  S-Curve  can  be  represented  by  a  composite  function. 
According to Belkaoui (Belkaoui, 1986:pg 14) we can construct an S-function by using the
Stanford-B  curve  to  represent  the  early  part  of  the  curve  and  the  DeJong  curve  to 
represent  the  latter  part  of  the  curve.  In  this  case,  the  S-Curve  function  would  look  like: 

$MC = a[M + (1 - M)(X + B)^{n}]$  (2-12)

where  (as  before): 

MC  is  the  marginal  time  for  the  Xth  unit, 

a  and  n  are  parameters  analogous  to  the  power  function  parameters, 

M  is  the  factor  of  incompressibility,  and 

B  is  a  constant  which  may  be  expressed  as  the  number  of  units  theoretically 
produced  prior  to  the  first  unit  acceptance.  (B  is  between  0  and  10  with  4 
being  a  typical  value). 
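
Because the composite function combines the Stanford-B displacement with DeJong's incompressibility factor, the earlier models fall out as special cases. The Python sketch below checks this numerically; the default parameter values (a = 1000, M = 0.25, B = 4, n = -0.5) are taken from the typical values mentioned in this chapter or assumed for illustration, and the function name is hypothetical.

```python
def s_curve(x, a=1000.0, m=0.25, b_factor=4.0, n=-0.5):
    """Marginal time for unit x under MC = a * (M + (1 - M) * (X + B)**n)."""
    return a * (m + (1.0 - m) * (x + b_factor) ** n)

x = 10
# M = 0 removes the fixed part and leaves the Stanford-B model.
print(s_curve(x, m=0.0), 1000.0 * (x + 4.0) ** -0.5)
# B = 0 removes the displacement and leaves DeJong's formula.
print(s_curve(x, b_factor=0.0), 1000.0 * (0.25 + 0.75 * x ** -0.5))
# M = 0 and B = 0 together give the basic log-linear model.
print(s_curve(x, m=0.0, b_factor=0.0), 1000.0 * x ** -0.5)
# M = 1 makes the time constant at a, the fully machine-dominated case.
print(s_curve(x, m=1.0))
```

For large X the variable part vanishes and the marginal time approaches a·M, which is the plateau shared with the other minimum-cost models.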

Like all models, S-shaped curves have their advantages and disadvantages.
These curves can model more complex learning curves which do not follow a simple
exponential form. However, when we have only initial production data, there
may not be sufficient data for an S-shaped curve to accurately estimate long-term production.

Non-Linear Estimation. Thomas (Thomas, 1975) compared log-linear and
nonlinear estimation techniques for the standard learning curve model. The log-linear
technique assumes that the error is distributed normally under the logarithmic transformation and
that it tends to be proportional to the value of the dependent variable and multiplicative in
nature. The nonlinear estimation techniques, on the other hand, generally assume a
constant error term which is additive. Thomas found that the nonlinear estimation
techniques  were  robust  and  performed  well  in  estimating  the  parameters  of  the  model 
with  either  error  distribution;  the  log-linear  model  did  not  perform  as  well  when  the  error 
was  additive  and  constant  in  nature. 

2-4.  Summary 

Over  the  course  of  the  last  eighty  years,  the  learning  curve  has  been  used 
extensively  as  a  management  accounting  tool;  this  type  of  curve  can  provide  important 
information  for  decision  making.  In  this  chapter,  we  have  presented  the  concept  of  the 
learning  phenomenon  and  have  gone  on  to  briefly  outline  not  only  the  theory  behind  the 
basic  log-linear  learning  curve  but  some  ‘new’  geometric  versions  as  well. 

The  Crawford  Model  (or  the  basic  log-linear  learning  curve  model)  states  that  the 
improvement  in  productivity  is  fairly  constant  as  output  increases.  The  main  shortcoming 
of  this  model  lies  in  the  fact  that  as  the  number  of  units  produced  increases,  the  time/cost 
to  produce  them  approaches  the  unrealistic  value  of  zero.  The  Forsythe  and  Pegel 
models  are  more  realistic  than  the  Crawford  since  they  include  a  minimum  cost  per  unit 
which  does  not  allow  the  unrealistic  approach  towards  zero.  The  Stanford-B  model 
attempts  to  account  for  prior  learning  by  including  a  displacement  or  B  factor. 

The models mentioned in the preceding paragraph seem to be worth further
investigation since they are fairly simple and are intuitive for the most part. On the other
hand, due to their complex nature, we will not devote further study to DeJong’s
Learning Formula and Levy’s Adaptation Function within this thesis. Although the S-Curve
has the ability to model more complex learning functions which do not follow the
simple exponential form, it may also be unsuitable due to its complexity. Further, the S-Curve
model may require more than just initial production data if it is to accurately
estimate long-term production; initial production data is frequently the only data we have.

Variations on the basic learning curve have proliferated mainly because the log-linear
model does not always provide a good fit or forecast for the data at hand. The next
chapter explores a completely different class of models, the ARIMA models, as an
alternative method of predicting future aircraft build times.
3. Proposed ARMA Learning Curve Models


3-1.  Introduction  to  ARMA  Models 

With  the  advent  of  widespread  computer  availability  in  organizations,  the  general 
and statistically based methods of time-series analysis known as Box-Jenkins or ARIMA
processes  have  been  developed  and  applied  to  forecasting  (Makridakis,  Wheelwright,  and 
McGee,  1983).  ARIMA,  which  is  an  abbreviation  for  autoregressive  (AR)  integrated  (I) 
moving  average  (MA),  describes  a  broad  class  of  time-series  models.  Before  we  present  a 
detailed  discussion  of  the  two  basic  ARIMA  processes  which  are  of  interest  in  this  study, 
we  briefly  define  some  of  the  terms  used  in  the  rest  of  this  chapter  and  discuss  why  these 
processes  are  being  explored. 

The  following  definitions  are  based  on  discussions  found  in  Makridakis, 
Wheelwright, and McGee (1988), Box and Jenkins (1976), and Montgomery, Johnson, and
Gardiner (1990). A process is stationary if its statistical properties are independent of the
particular time during which it is observed. Specifically, if the process underlying a time
series is based on a constant mean and variance, then the time series is said to have a
stationary mean and variance. If a process is stationary, the Box and Jenkins model used
is generally an ARMA(p,q), as discussed in subsequent sections of this chapter.

A process is nonstationary if its statistical properties (especially its mean and
variance) depend on the time during which it is observed. In other words, if the process
underlying the time series does not have a constant mean and/or a constant variance, it is
nonstationary. If a process is nonstationary, the Box and Jenkins model generally used
is an ARIMA(p,d,q) with the differencing term, d, not equal to zero. In this thesis,
attention will be focused on stationary models.

An  ARIMA  model  is  parsimonious  if  it  uses  as  few  parameters  as  possible  in  the 
model/data  fitting  process.  For  example,  if  we  assume  that  an  AR(1)  process,  which  has 
one  parameter,  and  an  AR(2)  process,  which  has  two  parameters,  both  provide  reasonable 
fits  to  a  particular  data  series,  the  concept  of  parsimony  would  have  us  choose  the  AR(1) 
model  over  the  AR(2). 

An  autoregressive  process  is  a  form  of  regression  where,  instead  of  the  dependent 
variable  (the  item  to  be  forecast)  being  related  to  independent  variables,  it  is  related  to 
past  values  of  itself  at  varying  time  lags.  Thus,  an  autoregressive  model  would  express  the 
observation  at  time  t  as  a  function  of  previous  values  of  that  time  series.  This  matches  up 
well  with  the  logic  behind  learning  curves  since,  if  learning  is  indeed  occurring,  one  would 
expect  that  the  amount  of  resources  required  to  produce  a  given  unit  would  depend,  or  be 
related  to,  the  amount  of  resources  required  to  produce  previous  units. 

A moving average process is a process in which the value of the time series at time
t is influenced by a current error term and (possibly) weighted error terms from the past.
The error terms of which we speak are independent and identically distributed random
noises or shocks that are generally uncontrollable or unpredictable.

Autoregressive/moving  average  (ARMA)  schemes  can  be  autoregressive  (AR)  in 
form,  moving  average  (MA)  in  form,  or  a  combination  of  the  two  (ARMA).  In  an  ARMA 
model,  the  series  to  be  forecast  is  expressed  as  a  function  of  both  previous  values  of  the 
series (autoregressive terms) and previous random errors (the moving average terms).

In the next three sections we discuss, in somewhat more detail, the nature and
structure of three ARIMA processes, AR(p), MA(q), and ARMA(p,q) (more formally
known as ARIMA(p,0,0), ARIMA(0,0,q), and ARIMA(p,0,q) processes, respectively),
which are of interest in this study. We also consider the suitability of each model to
simulate learning.

3-2.  Autoregressive  Processes 

As  stated  above,  autoregressive  (AR)  processes  are  a  form  of  regression  where  the 
dependent  variable  is  related  to  past  values  of  itself  at  varying  time  lags.  This  might  seem 
contrary  to  regression  methods  which  attempt  to  forecast  variations  in  some  variable  of 
interest,  the  dependent  variable,  on  the  basis  of  variations  in  a  number  of  other  factors,  the 
independent variables. The general form of AR processes may be developed by starting
with the basic causal or explanatory regression equation, which has the form

$$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k + e \tag{3-1}$$

where

$Y$ is the dependent variable,

$X_1, X_2, \ldots, X_k$ are the independent variables,

$b_0, b_1, b_2, \ldots, b_k$ are the linear regression coefficients, and

$e$ is an error term which is assumed to have

$E[e] = 0$ and $\mathrm{Var}[e] = \sigma^2$.

The independent variables, $X_1, X_2, \ldots, X_k$, can represent factors such as the number of
parts to be installed, the quality of parts, or the type of aircraft to be built, while the
dependent variable, $Y$, could represent the time it took to build an aircraft.

Some principles of regression can be applied to time series methods. We could,
for example, allow the independent variables to represent previous values of $Y$, the time
required to build earlier aircraft, and suggest that the time it takes to build the $t$-th
aircraft depends not on the number of parts to be installed or on the quality of the parts,
but on the time it took to build the $(t-1)$-th, the $(t-2)$-th, and earlier aircraft.
We could define the dependent variable as

$$Y_t = \mu' + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \phi_3 Y_{t-3} + \phi_4 Y_{t-4} + \cdots + \phi_k Y_{t-k} + e_t \tag{3-2}$$

where

$Y_t$ is the value of the dependent variable at time $t$,

$Y_{t-1}, Y_{t-2}, Y_{t-3}, Y_{t-4}, \ldots, Y_{t-k}$ are the 'independent' variables, which represent the
prior values of the dependent variable,

$\mu', \phi_1, \ldots, \phi_k$ are the linear regression coefficients, and

$e_t$ is an error term which is assumed to have

$E[e_t] = 0$ and $\mathrm{Var}[e_t] = \sigma^2$.

This is called the basic autoregressive (AR) form since the independent factors are simply
time-lagged values of the dependent variable.
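
Because the AR form is an ordinary regression on time-lagged values, its coefficients can be estimated by least squares. The following is a minimal Python sketch, assuming NumPy is available. The function name `fit_ar_least_squares` is hypothetical, and conditional least squares is only one of several possible estimation methods; it is not necessarily the method used by any particular forecasting package.

```python
import numpy as np


def fit_ar_least_squares(y, p):
    """Estimate the constant and the p lag coefficients of an AR(p) model.

    Illustrative helper (not from the thesis): regress Y_t on
    [1, Y_{t-1}, ..., Y_{t-p}] by ordinary least squares.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Column j holds Y_{t-j} for t = p, ..., n-1; the first column is the constant.
    columns = [np.ones(n - p)] + [y[p - j:n - j] for j in range(1, p + 1)]
    design = np.column_stack(columns)
    target = y[p:]
    coefficients, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefficients[0], coefficients[1:]


# Noise-free AR(1) series with constant 25 and lag coefficient 0.75.
series = [1000.0]
for _ in range(19):
    series.append(25.0 + 0.75 * series[-1])

mu_prime, phi = fit_ar_least_squares(series, p=1)
print(mu_prime, phi)
```

On noise-free data the regression recovers the generating constant and lag coefficient up to floating-point error; with real build-time data the estimates would of course carry sampling error.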

One of the AR(p) models which we explore as part of this thesis is the AR(1) model.
The model takes the following form

$$Y_t = \mu' + \phi_1 Y_{t-1} + e_t \tag{3-3}$$

where the variables are the same as defined in equation (3-2). Before plotting the
AR(1) model given in equation (3-3), we define the difference between unconditional
and conditional means.

If $\phi_1$ is chosen between $-1$ and $+1$, the AR(1) process is stationary with mean

$$E(Y) = \mu = \frac{\mu'}{1 - \phi_1}. \tag{3-4}$$

Since we make no assumption about previous values observed, this is referred to as the
unconditional mean. The unconditional mean differs from the conditional mean in that the
conditional mean takes previous values of the time series into account. For example,
suppose that, for an AR(1) process, the value observed at time $t-1$ is $y_{t-1}$. Then the expected
value at time $t$ given this information is

$$E(Y_t \mid Y_{t-1} = y_{t-1}) = E(\mu' + \phi_1 Y_{t-1} + e_t \mid Y_{t-1} = y_{t-1}) = \mu' + \phi_1 y_{t-1} \tag{3-5}$$

where we assume that $E(e_t) = 0$. The quantity in equation (3-5) is not necessarily equal to
$\mu$. In particular,

$$E(Y_2 \mid Y_1 = y_1) = \mu' + \phi_1 y_1$$

and similarly

$$\begin{aligned}
E(Y_3 \mid Y_1 = y_1) &= E(\mu' + \phi_1 Y_2 + e_3 \mid Y_1 = y_1) \\
&= \mu' + \phi_1 E(Y_2 \mid Y_1 = y_1) + E(e_3) \\
&= \mu' + \phi_1 \mu' + \phi_1^2 y_1.
\end{aligned}$$

In general, for $t > 2$,

$$E(Y_t \mid Y_1 = y_1) = \mu' \left[ 1 + \sum_{j=1}^{t-2} \phi_1^{\,j} \right] + \phi_1^{\,t-1} y_1. \tag{3-6}$$

This illustrates the fact that the initial condition (the value observed at time 1) has a
decreasing effect in the long run provided that $-1 < \phi_1 < +1$. In fact, in the long run,

$$\lim_{t \to \infty} E(Y_t \mid Y_1 = y_1) = \lim_{t \to \infty} \left[ \mu' \left( 1 + \sum_{j=1}^{t-2} \phi_1^{\,j} \right) + \phi_1^{\,t-1} y_1 \right] = \frac{\mu'}{1 - \phi_1} = \mu \tag{3-7}$$

provided $-1 < \phi_1 < +1$. The quantity given in equation (3-7) is the unconditional mean.
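
The behavior of the conditional mean can be checked numerically. The sketch below, in Python, iterates the AR(1) recursion on expected values and compares it with the equivalent geometric-sum form; the function names are hypothetical, and the parameter values (constant 25, lag coefficient 0.75, first observation 1000) are chosen only for illustration.

```python
def conditional_mean(t, y1, mu_prime, phi1):
    """E(Y_t | Y_1 = y1) for an AR(1) process, by iterating the recursion."""
    expected = y1
    for _ in range(t - 1):
        expected = mu_prime + phi1 * expected
    return expected


def conditional_mean_closed_form(t, y1, mu_prime, phi1):
    """Same quantity as mu' * sum_{j=0}^{t-2} phi1**j + phi1**(t-1) * y1."""
    geometric_sum = sum(phi1 ** j for j in range(t - 1))
    return mu_prime * geometric_sum + phi1 ** (t - 1) * y1


unconditional_mean = 25.0 / (1.0 - 0.75)
for t in (1, 2, 3, 10, 50):
    print(t,
          conditional_mean(t, 1000.0, 25.0, 0.75),
          conditional_mean_closed_form(t, 1000.0, 25.0, 0.75),
          unconditional_mean)
```

As t grows, both forms approach the unconditional mean, which is the sense in which the influence of the first observation dies out.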

To see why the unconditional mean is useful in simulating the effects of
learning, suppose $\mu' = 25$ and $\phi_1 = 0.75$, so that $\mu = 25/(1 - 0.75) = 100$. We can then generate
simulated values of $Y_t$ for $t = 2, \ldots, 50$, assuming that $Y_1 = 1000$ and the $e_t$'s are uniformly
distributed with $E(e_t) = 0$ and $\mathrm{Var}(e_t) = 25$. The plot which results from using the above
assumed values in equation (3-3) (the equation for AR(1)) is given in Figure 3-1.
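
The simulation just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used to produce the figure: the function name `simulate_ar1` and the fixed seed are hypothetical, and the uniform errors are drawn from a symmetric interval whose half-width is chosen so that the variance equals 25.

```python
import math
import random


def simulate_ar1(n_units, y1, mu_prime, phi1, error_variance, seed=0):
    """Simulate Y_t = mu' + phi1 * Y_{t-1} + e_t with uniform, zero-mean errors."""
    rng = random.Random(seed)
    # A uniform variable on (-a, a) has variance a**2 / 3, so a = sqrt(3 * variance).
    half_width = math.sqrt(3.0 * error_variance)
    values = [y1]
    for _ in range(n_units - 1):
        shock = rng.uniform(-half_width, half_width)
        values.append(mu_prime + phi1 * values[-1] + shock)
    return values


path = simulate_ar1(n_units=50, y1=1000.0, mu_prime=25.0, phi1=0.75, error_variance=25.0)
for unit, cost in enumerate(path[:5], start=1):
    print(unit, round(cost, 2))
print("mean of last 20 units:", round(sum(path[-20:]) / 20.0, 2))
```

The early units fall steeply from 1000, and the later units fluctuate around the unconditional mean of 100.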

Figure 3-1 Illustration of the AR(1) Model
[Plot not reproduced; panel title: ARIMA(1,0,1); horizontal axis: Unit Number.]

Figure 3-2 Illustration of the Log-Linear Learning Curve Model
[Plot not reproduced; panel title: Log-Linear Learning Curve with N(0,25) Error.]


In Figure 3-2, we have a plot of a typical learning curve for comparison. Note the
similarity between the AR(1) curve and the typical learning curve. The main difference
between the two curves is that in the long run, the AR(1) model approaches

$$E(Y) = \mu = \frac{\mu'}{1 - \phi_1} = 100$$

while the learning curve continues to approach zero. As a result, the AR(p) model may
actually prove to be more useful for predicting learning types of data.

3-3.  Moving  Average  Processes 

In  a  manner  analogous  to  writing  the  basic  regression  equation  (3-1)  in  terms  of 
past  values  of  the  dependent  variable  to  define  the  AR  process  given  by  equation  (3-2), 
we  define  an  MA  process  by  writing  the  basic  regression  equation  in  terms  of  past  error 
terms;  in  other  words,  we  let  the  past  error  terms  be  the  independent  variables.  The 
general  equation  for  an  MA  model  is 

$$Y_t = \mu + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \theta_3 e_{t-3} + \cdots + \theta_k e_{t-k} + e_t \tag{3-8}$$

where

$Y_t$ is the value of the dependent variable at time $t$,

$e_t, e_{t-1}, e_{t-2}, e_{t-3}, \ldots, e_{t-k}$ are the 'independent' variables, which represent the
time-lagged error terms. These error terms are defined as $e_t = X_t - Y_t$,
$e_{t-1} = X_{t-1} - Y_{t-1}$, $e_{t-2} = X_{t-2} - Y_{t-2}$, and so on, where $X$ is the actual value and $Y$
is the forecasted value, and

$\mu, \theta_1, \theta_2, \ldots, \theta_k$ are the linear regression coefficients.

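One of the MA(q) models which we explore as part of this thesis is the MA(1)
model. Written with the Box-Jenkins sign convention, in which the moving average
coefficient enters with a negative sign, the model takes the following form

$$Y_t = \mu + e_t - \theta_1 e_{t-1}. \tag{3-9}$$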

To assess the usefulness of the MA(1) model for forecasting learning data, we fix
$Y_1 = 1000$, $\mu = 100$, and $\theta_1 = -0.75$, and assume the errors are distributed N(0,25); specifically,

$E[e_t] = 0$ and $\mathrm{Var}[e_t] = \sigma^2 = 25$.

Since we wish to fix $Y_1 = 1000$, this implies

$$1000 = 100 + e_1 - \theta_1 e_0.$$

If we assume $e_0 = 0$, then we should have $e_1 = 900$. After time 1, we go on to compute ten
more values of $Y_t$ using equation (3-10):

$$Y_t = 100 + e_t - \theta_1 e_{t-1} \tag{3-10}$$

The plot of the resulting data is given in Figure 3-3.
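
The MA(1) calculation above can be sketched in Python as follows. The function name `simulate_ma1` and the seed are hypothetical; the standard deviation of 5 corresponds to the stated error variance of 25. Setting the standard deviation to zero isolates the effect of the large first error on later units.

```python
import random


def simulate_ma1(n_units, mu, theta1, first_error, sigma, seed=0):
    """Simulate Y_t = mu + e_t - theta1 * e_{t-1}, with e_0 = 0 and a chosen e_1."""
    rng = random.Random(seed)
    errors = [0.0, first_error]  # e_0 and e_1
    values = [mu + errors[1] - theta1 * errors[0]]  # Y_1
    for _ in range(n_units - 1):
        errors.append(rng.gauss(0.0, sigma))
        values.append(mu + errors[-1] - theta1 * errors[-2])
    return values


# Eleven units: Y_1 plus ten more values, with N(0, 25) errors (sigma = 5).
noisy = simulate_ma1(n_units=11, mu=100.0, theta1=-0.75, first_error=900.0, sigma=5.0)

# Same model with the random errors switched off after unit 1.
noise_free = simulate_ma1(n_units=11, mu=100.0, theta1=-0.75, first_error=900.0, sigma=0.0)
print(noise_free[:4])
```

In the noise-free run the first error affects only units 1 and 2; from unit 3 onward the series sits at its mean, which is why the initial high cost vanishes so quickly under an MA(1) model.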

Figure  3-3  Illustration  of  MA(1)  Model 


Observe  that  Yi,  the  cost  of  unit  1,  is  1000  (by  design).  The  value  of  Y2  is  affected  by  the 
large  error  at  unit  1.  By  unit  3,  the  effect  of  unit  one  on  the  cost  seems  to  be  almost 
completely  gone.  Since  the  effect  of  unit  one  dissipates  so  quickly,  we  feel  that  MA 
models  will  not  be  useful  for  forecasting  learning  type  data. 

3-4.  AutoRegressive/Moving  Average  Processes 

As stated in Section 3-1, ARMA schemes can be autoregressive (AR) in form,
moving average (MA) in form, or the two can be effectively coupled to form a very
general and useful class of time-series models called autoregressive/moving average
(ARMA) processes. In the ARMA model, the series to be forecast is expressed as a
function of both the previous values of the series (the autoregressive terms) and the
previous random errors (the moving average terms). The basic elements of AR and MA
processes can be combined to produce a large variety of models. As part of this study we
explore the ARMA(1,1) model, which has the following form

$$Y_t = \underbrace{\phi_1 Y_{t-1}}_{\text{AR(1) part}} + \underbrace{\mu'}_{\text{Constant}} - \underbrace{\theta_1 e_{t-1}}_{\text{MA(1) part}} + e_t \tag{3-11}$$

Here, the dependent variable, $Y_t$, depends on one previous value of the dependent
variable, $Y_{t-1}$, one previous error term, $e_{t-1}$, and a constant which adjusts the mean of the
process. Under specific conditions on the coefficients, ARMA models are considered stationary in
both the mean and the variance. A plot of a simulated ARMA(1,1) process is shown
below.


Figure 3-4 Illustration of the ARIMA(1,0,1) Model
[Plot not reproduced; panel title: ARMA(1,1) Model.]

This particular model is calculated using $\phi_1$ and $\theta_1$ equal to 0.75, $\mu'$ equal to 25, and $Y_1$
equal to 1000. The initial value for the error, $e_1$, was taken as zero since the first unit cost
was given with certainty. The remaining errors have an N(0,25) distribution. Note how
the curve levels out after about twenty units to a mean of approximately 100; this is similar
to the curve calculated for the AR(1) model in Section 3-2. ARMA models appear to be
well suited to forecasting learning type data.
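
The ARMA(1,1) example can be sketched in Python along the same lines. The function name `simulate_arma11` and the seed are hypothetical; the parameter values are those quoted in the text (both coefficients 0.75, constant 25, first unit cost 1000, first error zero, later errors with variance 25).

```python
import random


def simulate_arma11(n_units, y1, mu_prime, phi1, theta1, sigma, seed=0):
    """Simulate Y_t = phi1 * Y_{t-1} + mu' - theta1 * e_{t-1} + e_t."""
    rng = random.Random(seed)
    values = [y1]
    previous_error = 0.0  # e_1 = 0 because the first unit cost is taken as known
    for _ in range(n_units - 1):
        error = rng.gauss(0.0, sigma)
        values.append(phi1 * values[-1] + mu_prime - theta1 * previous_error + error)
        previous_error = error
    return values


noisy = simulate_arma11(50, 1000.0, 25.0, 0.75, 0.75, sigma=5.0)
noise_free = simulate_arma11(50, 1000.0, 25.0, 0.75, 0.75, sigma=0.0)
print([round(value, 1) for value in noise_free[:5]])
print(round(noise_free[-1], 3))
```

With the random errors switched off, the path is the same geometric decline toward 100 that the AR(1) conditional mean follows, which is the leveling-out behavior described above.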

3-5.  ARIMA  Models  for  Learning 

As discussed earlier, a major criticism of the basic log-linear learning curve is that
it does not completely level out in the long run; it continues to approach zero through
successive unit builds. We find this pleasing, though unrealistic, because it implies that if
we build enough aircraft, it eventually will take no time at all to build them! In Chapter
2, this problem was addressed through the use of modified learning curve models such as
the Stanford B Model and Pegel's Exponential Function, which both provided curves that
level out in the long run. In this chapter, however, we have presented examples of ARMA
models that exhibit a similar pattern of a dramatic initial decline followed by a leveling out
to a long-run mean. This suggests that stationary ARMA models might be useful for
modeling the build history of an aircraft manufacturing process.

In  the  next  chapter,  we  attempt  to  fit  ARMA  models  to  real  and  simulated  sets  of 
learning  data,  and  compare  the  results  with  those  of  fitting  standard  learning  curve  models 
to  the  same  data.  In  the  next  section,  however,  we  first  exploit  a  commercial  forecasting 
software  package,  FORECAST  PRO,  to  fit  various  ARMA  models  to  the  first  50 
observations from a simulated log-linear learning curve. The purpose of this exercise is
simply to assess how well stationary ARMA models can represent traditional log-linear
learning curves, especially in the short run. (In the long run, we know that a stationary
ARMA model converges to its unconditional mean while the log-linear learning curve
converges to zero.)

3-6. Fitting ARMA Models to a Log-Linear Learning Curve

To  represent  a  log-linear  learning  curve,  we  simulate  the  first  50  values  from  a 
model  of  the  form 

$$Y = aX^b + e \tag{3-12}$$

where (as presented in Chapter 2)

$Y$ is the number of hours, or 'cost,' of producing the $X$-th unit,

$X$ is the unit number,

$a$ is the 'cost' of producing the first unit,

$b$ is a constant which determines the learning rate, and

$e$ is error uniformly distributed (1% above and 1% below) about the expected value of $Y$
for each $X$.

Arbitrarily, and for purposes of illustration, we let the parameter $a$, the cost of producing
the first unit, be equal to 1000. We additionally assume the learning rate, $p$, to equal
eighty percent (0.8), and then calculate the value of $b$ using equation (2-3) as follows:

$$b = \ln(p)/\ln(2) = \ln(0.8)/\ln(2) = -0.322. \tag{2-3}$$

The final model, which we use to generate data for units 1 through 50, takes on the
following form:

$$Y = 1000X^{-0.322} + e \tag{3-13}$$

To generate the data, we use the spreadsheet software package EXCEL 5.0, by
Microsoft Corporation, to compute the values of $Y$ for units 1 through 50 from
equation (3-13). Since the equation used to calculate the data has a random component
(the error is uniformly distributed about the mean), the simulated data does also. For this
reason, we cannot take just one sample data set, analyze it, and draw conclusions; we need
additional data sets to analyze. For our purposes, we generate eight sets of simulated
data. A sample of this data is given in Table 3-1.

On  the  other  hand,  in  hindsight,  the  variance  of  the  random  error  terms  that  we 
have  simulated  is  so  small  that  we  essentially  are  fitting  models  to  the  first  50  values 
expected  from  the  log-linear  model.  Thus,  our  results  provide  a  measure  of  how  well  the 
fitted  models  reproduce  the  expected  behavior  of  a  log-linear  learning  curve. 
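
The data-generation step can be sketched in Python as an alternative to a spreadsheet. This is an illustration only: the function name `simulate_log_linear` and the seeds are hypothetical, and the error is assumed to be uniform between 1% below and 1% above the expected cost of each unit.

```python
import math
import random


def simulate_log_linear(n_units, first_unit_cost, learning_rate, pct_error, seed=0):
    """Generate Y = a * X**b + e, with b = ln(p) / ln(2) and e within +/- pct_error of a * X**b."""
    rng = random.Random(seed)
    b = math.log(learning_rate) / math.log(2.0)
    data = []
    for x in range(1, n_units + 1):
        expected = first_unit_cost * x ** b
        noise = rng.uniform(-pct_error, pct_error) * expected
        data.append(expected + noise)
    return b, data


b, data = simulate_log_linear(n_units=50, first_unit_cost=1000.0,
                              learning_rate=0.8, pct_error=0.01)
print(round(b, 4))

# Eight independent replications, one per (hypothetical) seed.
replications = [simulate_log_linear(50, 1000.0, 0.8, 0.01, seed=s)[1] for s in range(8)]
```

Because the error band is only one percent of the expected value, each replication stays very close to the expected log-linear curve.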


Table 3-1 Fifty Unit EXCEL-Simulated Log-Linear Data Set
[Table values are not legible in the source scan.]

As a preliminary investigation into the potential of candidate models to fit and
forecast log-linear types of data, we use Forecast-Pro to fit selected ARMA(p,q)
models to each of the eight data sets. The models we fit are AR(1), AR(2), AR(3),
AR(4), MA(1), MA(2), MA(3), MA(4), ARMA(1,1), ARMA(1,2), ARMA(2,1), and
ARMA(2,2). Since we are assuming that the process underlying the actual aircraft build
time series is stationary (which, however, is not strictly true for the log-linear data), we
choose not to fit any ARIMA(p,d,q | d ≠ 0) models to the simulated data. While using
Forecast-Pro, curiosity also dictates that we run its Expert Data Exploration routine, and
also fit various Exponential Smoothing models to the data.

Using Forecast-Pro, we fit the selected models to the simulated log-linear data.
The output standard diagnostics (including R², Adjusted R², Durbin-Watson and
Ljung-Box test statistics, MAPE¹, MAD², BIC³, RMSE⁴, and the Forecast Errors) from
Forecast-Pro are given in Appendix A. In addition to the standard diagnostics, Appendix
A contains plots of the data (the models for only one of the eight repetitions are shown),
including the fitted curves, forecasts with confidence intervals, and model parameter
estimates along with their associated Standard Errors, t-Statistics, and Significance
levels. This data is consolidated into one table, Table A-1. The data is reduced to essential
measures of model appropriateness (averaged across all eight repetitions), which are
included below in Table 3-2. Tables similar to Table 3-2 are also given in Appendix A
for each of the eight repetitions; they are given as Tables A-2 through A-9.


¹ MAPE - Mean Absolute Percent Error
² MAD - Mean Absolute Deviation
³ BIC - Bayesian Information Criterion
⁴ RMSE - Root Mean Square Error
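
Three of these error measures can be written out as short Python functions. These are the textbook definitions, given as an assumption; a commercial package may apply different conventions (for example, degrees-of-freedom adjustments), so the functions below are a sketch and not a reproduction of any package's output. MAPE is expressed as a fraction rather than a percentage.

```python
import math


def mape(actual, fitted):
    """Mean absolute percent error, as a fraction of the actual values."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, fitted)) / len(actual)


def mad(actual, fitted):
    """Mean absolute deviation between actual and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def rmse(actual, fitted):
    """Root mean square error between actual and fitted values."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))


actual_costs = [100.0, 200.0]
fitted_costs = [110.0, 180.0]
print(mape(actual_costs, fitted_costs),
      mad(actual_costs, fitted_costs),
      rmse(actual_costs, fitted_costs))
```

All three are measures of error, so smaller values indicate a closer fit.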

Table 3-2 Model Measures of Appropriateness Summary

(In the original, shading marks the four best results and the overall best result in each category.)

Simulation Model         RSQR      ADJRSQ    MAPE      MAD        RMSE
Simple Exp. Smoothing    0.8905    0.8908    0.0639    27.4850    45.4950
SMA(1)                   0.8345    0.8394    0.1519    27.9463    45.9000
AR(1)                    0.8976    0.8988    0.0639    27.3638    44.9650
AR(2)                    0.9126    0.9105    0.0621    26.2300    39.8150
AR(3)                    0.9243    0.9211    0.0619    25.4559    37.3200
AR(4)                    0.9350    0.9308    0.0607    24.8680    34.7200
MA(1)                    0.7077    0.7077    0.1377    54.4250    76.3150
MA(2)                    0.8619    0.8588    0.1019    39.9700    51.9988
MA(3)                    0.9027    0.8986    0.0821    32.4688    42.4975
MA(4)                    0.9223    0.9173    0.1860    29.8850    37.5200
ARMA(1,1)                0.9066    0.9047    0.0641    27.1563    42.1275
ARMA(2,1)                0.9161    0.9125    0.0626    26.0114    38.7388
ARMA(1,2)                0.9381    0.9224    0.0615    25.2964    36.6213
ARMA(2,2)                0.9261    0.9217    0.0610    24.9766    36.1988
AR Average               0.9174    0.9153    0.0621    25.9794    39.2050
MA Average               0.8487    0.8456    0.1269    39.1872    52.0828
ARMA Average             0.9217    0.9153    0.0623    25.8602    38.4216
Sum                      12.4763   12.4352   1.2213    419.5383   610.2325
Average                  0.8912    0.8882    0.0872    29.9670    43.5880

Observing  the  data  contained  in  Table  3-2,  we  note  the  following: 

(1) Starting at the top of Table 3-2, we see that Simple Exponential Smoothing and
SMA(1)⁵ turn in a relatively poor performance. The statistics for simple exponential
smoothing are below average⁶ in the categories of RSQR and RMSE, but are above
average in the categories of MAPE, ADJRSQ, and MAD. Four out of five of the
statistics for SMA(1) are below average; the only above average statistic is in the category
of MAD.

(2) The AR models seem to do quite well. AR(2) has above average statistics in every
category, and AR(1) does in all but one; AR(1) is below average in RMSE. AR(3) and
AR(4) do remarkably well in that their statistics are among the best four statistics in
every single category.

(3) As expected, the MA models have generally the poorest results in the
investigation. MA(1) and MA(2) have below average scores in every single category.
The MA(3) and MA(4) models do better; each of them is above average in four of the
five categories; MA(3) is below average in the MAD category while MA(4) is below
average in the MAPE category.

(4) Like the AR(2), AR(3), and AR(4) models, all of the ARMA models are above
average in all categories; ARMA(1,2) and ARMA(2,2) have statistics which are among
the best four in every single category.

The overall results seem to suggest that the Simple Exponential Smoothing model
and the Simple Moving Average model do a mediocre job of fitting the simulated
log-linear learning curve data. Consider the AR, MA, and ARMA averages given at the
bottom of the table; they summarize the overall findings quite well. The AR models,
overall, are better than average in all categories, are best in the category of MAPE, and
are tied for best with the ARMA models in the category of ADJRSQ. Overall, the MA
models turn in a below average statistic in every single category. Finally, overall, the
ARMA models do very well in fitting the data. They had the best average statistic in all
categories but one; they were simply above average in the category of MAPE.

What we've established is that the AR and ARMA models generally seem well
suited to fitting the log-linear data despite the fact that the true log-linear model has a
nonstationary mean. Observation of the statistics reveals, however, that the statistics
differ very little from one model to another within each category; the fact that one is
better than another is of little practical significance. The only serious difference between
statistics from one model to another occurs in the MA(1) and MA(2) models, which seem
to have much worse statistics than the other models.

Now that this analysis is complete, we move on to a forum which provides equal
treatment to all potential models⁷, including the modified learning curve models discussed
in Chapter 2. This is the topic on which we concentrate our energy in Chapter 4.

⁵ SMA - Single Moving Average, in which the forecast for time t+1 is simply the observation at time t.
⁶ The statistics which are calculated (and which are displayed in Table 3-2) for the various models under
study in this section cannot all be compared directly. For example, the RSQR and ADJRSQ statistics are
considered better if their magnitude is larger. The MAPE, MAD, and RMSE, being measures of error, are
considered to be better if their magnitudes are smaller. So, for purposes of clarity, we refer to statistics as
being better or worse than average. A better than average RSQR statistic is a statistic which is larger
than the average RSQR, while a better than average MAPE statistic is lower than the average MAPE
statistic.
⁷ Forecast-Pro's repertoire of models does not include models based on the learning curve.


4. A Comprehensive Model Comparison


4-1.  Introduction 

To  be  useful  in  program  management,  a  model  must  be  accurate.  This  raises  a 
very  important  question,  “How  do  we  pick  a  model  from  among  the  models  we’ve  been 
considering  thus  far,  that  will  provide  reasonably  accurate  predictions  of  future  build 
times?” 

To  help  focus  our  sights,  our  next  effort  is  organized  into  three  main  stages  which 
are  discussed  in  detail  in  Sections  4-2, 4-3,  and  4-4,  respectively.  In  each  stage  we  fit  the 
candidate  models  to  a  particular  set  of  data,  using  the  spreadsheet  program  EXCEL  to 
estimate  the  various  model  parameters.  The  purpose  of  using  EXCEL  is  to  put  each 
model  into  an  environment  where  it  may  be  compared  with  the  other  potential  models  on 
an equal basis. (Forecast Pro is good for helping choose some promising ARIMA models,
but since it doesn't have a facility to fit learning curve models, it really can't be used to
complete the analysis; there's just no common footing.)

The  first  stage,  discussed  in  Section  4-2,  uses  a  simulated  log-linear  data  base  to 
test  the  potential  models’  ability  to  fit  a  log-linear  curve.  The  second  stage,  discussed  in 
Section  4-3,  uses  data  obtained  from  the  F-102  production  program  to 

1. test the potential models' ability to fit the complete data set, and to

2. test, using a hold-out set⁸, the potential models' ability to forecast future build
times.

Finally, the third stage, discussed in Section 4-4, sets out to forecast future build times in
the C-17 program.

⁸ 'Hold-out set' is a reference to the method of using a portion (x out of n observations) of a given set of
data to develop a model. The resulting model is then used to forecast the next n-x values in the series; the
forecasted values can then be compared to the hold-out set (the remaining observations from the given data
set) to determine the fidelity of the fitted model.

4-2. Fitting Simulated Log-Linear Data

In this part of the investigation, we take a closer look at each of the most
promising Log-Linear and ARMA models identified in Chapters 2 and 3. Each of the
models is used to fit a data base consisting of a 50-unit sequence of simulated log-linear
data.

First, EXCEL is used to generate fifty observations of log-linear data. The
equation used is

$$Y = aX^b + \text{error} \tag{4-1}$$

where, as in Chapter 3,

$a$ is the first unit cost and is taken to be 1000,

$X$ is the unit number, and ranges from 1 to 50,

$b$ is the constant which determines the learning rate, and is taken to be -0.3219 (80% curve), and

the error is uniformly distributed (one percent above and one percent below) about
the expected value of $Y$ for each unit $X$.

Once the data is generated, and the initial individual models are constructed,
EXCEL's Solver function is used to find the parameter values that minimize the Sum of
Squared Errors (SSE). Each error is the residual computed by calculating the difference
between the fitted values and the actual values⁹. We deem the model with the smallest
SSE to be the one which provides the best fit to the data.

The  models  and  their  respective  equations  are  given  in  Table  B-1  in  Appendix  B. 
Also  included  in  this  table  are  the  parameter  estimates  which  result  from  minimizing 
SSE.  Note  that  asterisked  parameters  are  the  parameters  which  are  adjustable  during  the 
optimization  process.  This  table  also  includes  the  minimized  values  obtained  for  SSE. 
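
The idea of choosing parameters to minimize SSE can be illustrated without a spreadsheet solver. The Python sketch below fits the log-linear model only, and it swaps in a different technique from the one used in the thesis: for a fixed exponent the best multiplier has a closed form, and the exponent is then found by a golden-section search. The function names, the search interval from -1 to 0, and the assumption that SSE has a single minimum on that interval are all illustrative assumptions.

```python
import math


def sse_for_exponent(units, costs, b):
    """For a fixed exponent b, return (SSE, a) where a minimizes SSE of Y = a * X**b."""
    powers = [x ** b for x in units]
    a = sum(y * p for y, p in zip(costs, powers)) / sum(p * p for p in powers)
    sse = sum((y - a * p) ** 2 for y, p in zip(costs, powers))
    return sse, a


def fit_log_linear_sse(units, costs, b_low=-1.0, b_high=0.0, tol=1e-10):
    """Golden-section search over b; assumes SSE is unimodal on [b_low, b_high]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while b_high - b_low > tol:
        left = b_high - ratio * (b_high - b_low)
        right = b_low + ratio * (b_high - b_low)
        if sse_for_exponent(units, costs, left)[0] < sse_for_exponent(units, costs, right)[0]:
            b_high = right  # the minimum lies in [b_low, right]
        else:
            b_low = left    # the minimum lies in [left, b_high]
    b = (b_low + b_high) / 2.0
    sse, a = sse_for_exponent(units, costs, b)
    return a, b, sse


# Noise-free check: data generated from a = 1000 and b = -0.3219.
units = list(range(1, 51))
costs = [1000.0 * x ** -0.3219 for x in units]
a_hat, b_hat, sse = fit_log_linear_sse(units, costs)
print(a_hat, b_hat, sse)
```

On noise-free data the search recovers the generating parameters with an SSE of essentially zero; with simulated errors added, the minimized SSE would be positive and could be compared across candidate models in the same way as described above.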

The results are summarized below in Table 4-1 (this table is the same as Table B-2
in Appendix B). This table is broken into two groups of model data. The first group
summarizes  the  learning-curve  models  and  the  second  group  summarizes  the  ARIMA 
models.  Note,  also,  that  this  table  displays  the  rank  of  each  model’s  SSE  across  the  two 
groups.  On  the  basis  of  SSE  alone,  the  best  overall  model  for  curve  fitting  seems  to  be 
the  basic  Log-Linear  Learning  Curve  model  which  comes  in  with  a  low  SSE  total  of 
763.7;  this  seems  like  an  intuitive  result  since  this  is  the  same  model  which  was  used  to 
produce  the  simulated  data  in  the  first  place.  The  best  ARIMA  model  for  fitting  this  curve 
seems  to  be  the  AR(4)  Model  which  comes  in  with  an  SSE  of  1438.8.  From  this  table, 
we  also  see  that,  based  once  again  on  the  magnitude  of  SSE,  the  worst  models  for  fitting 
seem  to  be  the  S-Curve,  Pegel,  and  MA(1)  models. 

The SSEs of the MA(1) and Pegel models turn out to be one and two orders of
magnitude larger than the average SSE, which suggests they might be inferior to the other
models. The poor performance of these models in fitting the log-linear data base leads us
to believe that they may also do poorly at forecasting this type of data. The Pegel model
is similar to the Forsythe model in that it contains a minimum cost term. Since the
Forsythe model performed so much better than the Pegel in this fit test, we chose to keep
it and to discard the Pegel model from further consideration in this chapter. The results
also confirm our suspicions from Chapter 3 regarding the MA(1) model; we recall that
the MA(1) model did not appear to be well suited to modeling learning curves since,
rather than showing an extended downward trend similar to learning, the curve had
reached its mean within just two time periods. Retaining the S-Curve for further study is
questionable also since it not only performed poorly in the fit test, but also suffers from
high complexity. For the reasons cited above, we drop these three models (MA(1), Pegel,
and S-Curve) from further consideration in this thesis.

⁹ In this thesis, the term 'fitted value' refers to the per-unit value produced by the model which is fitted to
the simulated data. The term 'actual value' refers to the (simulated) data to which the model is being fit.

Table 4-1 Summary of Log-Linear Fitting

Model          SSE           Rank
Log-Linear     763.7         1
Forsythe       [illegible]   4
[illegible]    [illegible]   11
[illegible]    242678.9      13
[illegible]    43241.7       12
[illegible]    2665105.0     14
[illegible]    9685.7        10
[illegible]    4851.1        7
AR(3)          1756.4        3
AR(4)          1438.8        2
[illegible]    5856.5        8
[illegible]    6274.5        9
[illegible]    3714.9        5
[illegible]    4468.4        6

Plots  which  include  the  original  data  as  well  as  the  fitted  curve  for  each  model,  are 
contained  in  Appendix  B. 




4-3.  Fitting  Historical  F-102  Data 


The data set for this section was obtained from the document entitled Cost
Functions for Airframe Production Programs (Womer, 1982), which draws its data from
the F-102 Program Cost History (1965). This report says very little about the data. It
tells us that the F-102 program comprised 1000 aircraft which were constructed
during the years 1953 through 1958. Further, it tells us that of these 1000 aircraft, 889
are F-102A interceptors and 111 are TF-102 trainers. Unfortunately, we are not told which
aircraft are which within the data base, and therefore cannot adjust for any irregularities
this might produce in the data.

The  variable  of  primary  interest  in  the  data  base  is  the  direct  labor  hours  for  each 
airframe;  this  is  the  column  in  Table  E-1  (Appendix  E)  entitled  TOTHRS.  A  portion  of 
this  table  is  shown  in  Table  4-2  for  the  reader’s  convenience. 


Table 4-2 A Sample of the Historical F-102 Data Base

OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
  1    1         1  402475    1       5942   1
  2    2         2  375849    1       5942   3
  3    3         3  278963    2       5942   7
  4    4         4  271223    2       5942   7
  5    5         5  262498    2       5942   8
  6    6         6  258078    2       5942   9
  7    7         7  243726    2       5942  10
  8    8         8  232766    2       5942  10
  9    9         9  220833    2       5942  11
 10   10        10  218827    2       5942  12
 11   11        11  322447    3       5942  15
 12   12        12  306736    3       5942  16
 13   13        13  290470    3       5942  17
 14   14        14  282951    3       5942  18
 15   15        15  233125    4       5942  21

^ The full data set is given in Appendix E. See the Epilogue at the end of this thesis for a short
description of the search for this data.

The first question which arises is, "Against what should TOTHRS be plotted?" This is an
important question because there are two columns in the data base with
headings that suggest they would be good candidates for the abscissa, or unit number. The
first possible column is labeled OBS; the second possible column is labeled PLN. Since
we have no written description of the data in these columns, we make the following
assumptions. We assume OBS is an indication of the relative position of an aircraft with
respect to the end of the manufacturing process. For example, the first aircraft off the
assembly line would be OBS 1, the second would be OBS 2, etc. We assume that the
OBS number has nothing whatsoever to do with the position, in the manufacturing line,
in which the plane started. We next assume PLN is an indication of the relative position
of an aircraft with respect to the start of the manufacturing process. For example, the first
aircraft started on the manufacturing line would be PLN 1, the second would be PLN 2,
etc. We assume that the PLN number has nothing to do with the position in which the
aircraft finishes the manufacturing process. For example, PLN 1 might actually not be
completed until after PLN 2, so we could have a situation where PLN 1 is the
same aircraft as OBS 2! How do we decide which column of data to use?

To help us decide, we use EXCEL to sort the data first by OBS number and then
by PLN number. We generate plots for each of these reordered data sets and present
these in Figures 4-1 and 4-2. The most obvious problem seen in the figures is the large
jump and then decline in the TOTHRS data at about PLN (and OBS) 11-15. Since it
occurs in both the plot of TOTHRS vs PLN and the plot of TOTHRS vs OBS, we cannot
choose which column to use based on this characteristic. Looking again, we also see
some small perturbations (spikes) in the data plots. From these we choose to use
TOTHRS vs PLN since the perturbations are much smaller in this reordered series.

Figure 4-1 Plot of Data: TOTHRS vs OBS


Figure 4-2 Plot of Data: TOTHRS vs PLN


Adjusting for the 'jump' in the data which appears in both figures is somewhat
difficult since we have no documentation and cannot be sure of the cause of this jump; it
could be a major modification in aircraft design which required new learning. We could
either ignore the jump or make some kind of adjustment for it so the models
might deal better with it. We seem to have two options for adjustment: we could either
start at PLN number one and skip^ Lot 3 (see column 5 in Table 4-2), or we could just
skip the first 10 observations (PLNs) and start the data at PLN
11. We arbitrarily decide to use the data starting at PLN 11. The resulting plot is shown
in Figure 4-3.

Figure 4-3 F-102 Data (PLN vs TOTHRS) Starting at PLN 11


^ Lot 3 contains all four of the points which make up the 'jump.'

Once the data is adjusted, as described above, we are ready to start exercising the
models. The first thing we do is use EXCEL the same way we did in Section 4-2, to see how
well each of the potential models can fit the F-102 data. As before, EXCEL's Solver
function is used to find the model parameters that minimize the Sum of Squared Errors
(SSE). The models and their respective equations are given in Appendix C in Table C-1.
Also included in this table are the parameter estimates that result from the use of the
solver. Asterisked parameters are the parameters which are adjustable during the
optimization process. The table also includes the minimum values for SSE. We again
use SSE as the measure of performance to compare the models against one another.
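The SSE-minimizing fit described above can be sketched outside of EXCEL. The following is a minimal illustration for the log-linear model Y = aX^b only; it is not the Solver procedure used in this thesis. It assumes that, for a fixed exponent b, the best multiplier a has a closed form, and that a golden-section search over b on an assumed bracket [-2, 0] is adequate. The function names are hypothetical.

```python
import math

def profile_sse(b, xs, ys):
    """SSE of Y = a * X**b, with a set to its least-squares optimum for this b."""
    powers = [x ** b for x in xs]
    a = sum(y * p for y, p in zip(ys, powers)) / sum(p * p for p in powers)
    sse = sum((y - a * p) ** 2 for y, p in zip(ys, powers))
    return sse, a

def fit_log_linear(xs, ys, b_lo=-2.0, b_hi=0.0, tol=1e-10):
    """Golden-section search over b; assumes the SSE has one minimum in [b_lo, b_hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = b_lo, b_hi
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if profile_sse(c, xs, ys)[0] < profile_sse(d, xs, ys)[0]:
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = c  # the minimum lies in [c, hi]
    b = (lo + hi) / 2.0
    sse, a = profile_sse(b, xs, ys)
    return a, b, sse
```

Calling `fit_log_linear(units, hours)` returns the estimated a and b together with the minimized SSE, which is the quantity used here to rank the models.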

The  results  of  using  the  models  to  fit  the  F-102  data  are  summarized  in  Table  4-3 
(This  table  is  the  same  as  Table  C-2  in  Appendix  C.).  Like  Table  4-1,  Table  4-3  is 
broken  into  two  groups  of  model  data.  The  first  group  summarizes  the  learning-curve 
models  and  the  second  group  summarizes  the  ARIMA  models.  Note,  also,  that  the  table 
contains  the  ranks  for  each  model  with  respect  to  SSE  across  the  two  groups.  Plots  which 
include  the  original  data  as  well  as  the  fitted  curve  for  each  model,  are  contained  in 
Appendix  C. 


Table 4-3 Summary of F-102 Curve Fitting Investigation

Model        SSE          Rank
Log-Linear   5.503E+10    10
Forsythe     5.503E+10    10
Stanford-B   4.540E+10    9
AR(1)        2.722E+10    8
AR(2)        4.930E+09    2
AR(3)        2.637E+10    7
AR(4)        4.877E+09    1
ARMA(1,1)    2.089E+10    6
ARMA(1,2)    2.079E+10    4
ARMA(2,1)    2.087E+10    5
ARMA(2,2)    2.079E+10    3

The best overall model for curve fitting appears to be the AR(4) model, which comes in
with a low SSE total of 4.877E+09. The best learning curve model for fitting this curve is
the Stanford-B model, which comes in with an SSE of 4.540E+10. Note that the SSE
values, and hence the ranks, for the Standard Log-Linear and the Forsythe models are the
same. This occurs because, in this case, the solver estimated the parameter $C_{min}$ to be zero
in the Forsythe model ($Y = aX^b + C_{min}$); when this occurs, the Forsythe model is the same
as the Standard Log-Linear model ($Y = aX^b$). For purposes of comparison, plots of both
fitted models as well as the F-102 data are shown in Figure 4-4. Note that although the
plots below show the data and fitted model up through the 200th observation, the fit was
actually performed using the observations up through 500.


Figure  4-4  Forsythe  Model  vs.  Log-Linear  Model  in  F-102  Fit 


What have we done here? All we've done is use our potential models to see
which can most closely follow the input data. In Section 4-2, we checked the models
against a simulated log-linear data set. In Section 4-3, we checked the models against the
F-102 data set. These experiments are good for giving us an idea of which models might
give a reasonable fit to this type of data, but the only way to really find out which ones
work and which ones don't work in forecasting is to actually forecast with them!

4-4.  Fit/Forecasting  Historical  F-102  Data 

In this section, we use the F-102 data to test the ability of the potential models to
forecast future build times. To accomplish this we use each of the models to fit the first
twenty observations in the data set^. This is done in a manner similar to that used in
Sections 4-2 and 4-3; we use EXCEL's solver to minimize the SSE by adjusting the
parameter values. The output of the solver includes not only the minimized SSE, but the
parameter estimates for each of the models. We use these parameters to build the forecast
models. Once the forecast models are built, we use each of them to forecast out to unit
500. Then two more SSE's are calculated based on the hold-out set of 480 and the full
series of 500, as measures of each model's forecasting fidelity.

^ Recall that the data set starts at PLN Number 11; the first twenty observations, therefore, consist of PLN
Number 11 through PLN Number 30. The forecast starts with PLN Number 31.
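The fit/forecast procedure can be sketched as follows. This is an illustration only: it fits the log-linear model by ordinary least squares on the logarithms of the data, which is a simpler substitute for (and not identical to) the SSE minimization performed with EXCEL's solver, and the function names are hypothetical.

```python
import math

def fit_loglog_ols(xs, ys):
    """Fit ln(Y) = ln(a) + b*ln(X) by ordinary least squares; returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    b = sxy / sxx
    return math.exp(my - b * mx), b

def sse(model, xs, ys):
    """Sum of squared errors between the data and the model's per-unit values."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

def fit_and_forecast(xs, ys, n_fit):
    """Fit on the first n_fit observations, then score the fit set and the hold-out set."""
    a, b = fit_loglog_ols(xs[:n_fit], ys[:n_fit])

    def model(x):
        return a * x ** b

    return {
        "a": a,
        "b": b,
        "sse_fit": sse(model, xs[:n_fit], ys[:n_fit]),
        "sse_holdout": sse(model, xs[n_fit:], ys[n_fit:]),
    }
```

With the F-102 series starting at PLN 11, a call such as `fit_and_forecast(pln, tothrs, 20)` would give the two SSE values that correspond to the 'first 20' and 'last 480' measures.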

An  example  plot  which  includes  the  original  data  as  well  as  the  fitted  AR(2) 
model  (best  forecast)  is  shown  in  Figure  4-5.  Although  the  AR(2)  provided  the  best 
forecast, we can see that the fit still isn't very good. The plots for the remainder of the
models' forecasts are contained in Figures D-1 through D-11 in Appendix D.

Figure  4-5  AR(2)  Forecast  of  F-102  Data 




The rank results of the forecasting are summarized in Table 4-4 (this table is the
same as Table D-2).

Table 4-4 Summary of the F-102 Fit/Forecast Investigation

Model        SSE (1st 20)   SSE (last 480)   Rank (1st 20)   Rank (last 480)
Log-Linear   2.965E+09      6.175E+11        10              4
Forsythe     2.965E+09      6.175E+11        10              4
Stanford-B   2.814E+09      1.278E+10        9               2
AR(1)        2.452E+09      1.073E+12        8               6
AR(2)        2.350E+09      7.064E+09        6               1
AR(3)        2.257E+09      2.158E+12        5               9
AR(4)        1.707E+09      3.892E+10        4               3
ARMA(1,1)    2.414E+09      1.034E+12        7               5
ARMA(1,2)    1.153E+09      1.293E+12        3               7
ARMA(2,1)    8.848E+08      3.319E+12        2               10
ARMA(2,2)    8.793E+08      1.457E+12        1               8

If we examine the table, we see that it includes not only the SSE for the first 20
observations (the 'fitting' set) and for the last 480 observations (the 'hold-out' set), but
also ranking data which gives first place to the model with the lowest SSE. We can
think of the SSE for the 'First 20' as a measure of the goodness of the model's fit and the
SSE for the 'Last 480' as a measure of the goodness of the model's forecast. Note that
the ARMA(2,2) model has the lowest SSE for the first twenty observations but ended up with
only the eighth lowest SSE for the last 480 observations. This indicates that although this
model provided the closest fit on the first twenty, it did not provide the best forecast for
the remainder of the sequence. The best forecast is provided by AR(2), which came in as the sixth
best fit for the first 20. A model which provides the best forecast doesn't necessarily
provide the best fit. Once again, the SSE values, as well as the ranks, for the Standard
Log-Linear and Forsythe models are the same because Cmin in the Forsythe model was
estimated to be zero by EXCEL's solver.

4-5. Fit/Forecasting Notional C-17 Data

In  this  section  we  use  notional  C-17  data  to  further  test  the  ability  of  the  models  to 
fit/forecast  future  aircraft  build  times.  Since  the  actual  C-17  data  is  proprietary,  we 
rescaled  it  so  it  would  not  show  the  actual  production  capability  of  the  contractor.  The 
data  we  use  in  this  section,  therefore,  is  not  actual  C-17  data  but  notional  data. 

Since  the  number  of  hours  seems  to  increase  between  units  1  and  3.5,  we  start  our 
test  data  set  at  observation  3.5.  We  are  not  sure  why  the  hours  increase  within  this  range, 
but  we  can  speculate  that  perhaps  the  early  production  units  were  used  for  some  kind  of 
testing  which  resulted  in  their  completion  taking  longer  than  it  normally  might  have.  In 
order  to  give  the  models  a  slightly  less  confusing  data  set  on  which  to  work,  we  left  out 
these  observations  and  started,  simply,  at  observation  3.5. 

To test the ability of our potential models to forecast, we proceed in a manner
similar to that used in the previous section. We use each of the models to fit the first
fifteen observations (observations 3.5 through 18) in the data set^. We then use EXCEL's
solver to minimize the SSE by adjusting the parameter values. Once again, the output of
the solver includes not only the minimized SSE, but the parameter estimates for each of
the models. We use these parameters to build the forecast models. Once the forecast
models are built, we use each of them to forecast build times for units 19 through 23.
Finally, as before, two more SSE's are calculated based on the 'fitting set' of 15 and the
'hold-out set' of 8; these are utilized as measures of each model's forecasting fidelity.
The results of the forecasting are summarized in Table 4-5 (this table is the same as
Table F-2 in Appendix F). Plots which include the original data as well as the fitted
curve for each model are contained in Appendix F.

^ Recall that the data set starts at PLN Number 3.5; the first fifteen observations, therefore, consist of
Observation Number 3.5 through Observation Number 18. The forecast starts with Observation Number
19.


Table 4-5 Summary of the Notional C-17 Fit/Forecast Investigation

Model        SSE (1st 15)   SSE (all 23)   SSE (last 8)   Rank (1st 15)   Rank (all 23)   Rank (last 8)
Log-Linear   870.7          1040.1         169.4          6               6               3
Forsythe     291.1          386.5          95.4           2               2               2
Stanford-B   359.2          364.7          5.5            5               1               1
AR(1)        1266.9         1727.7         460.8          7               8               10
AR(2)        1340.4         1753.9         413.6          9               9               9
AR(3)        1405.5         1754.0         348.4          10              10              8
AR(4)        1302.1         1503.5         201.4          8               7               4
ARMA(1,1)    89.9           398.1          308.2          1               3               7
ARMA(1,2)    3699.0         10270.0        6571.0         11              11              11
ARMA(2,1)    329.9          572.3          242.3          4               5               6
ARMA(2,2)    302.9          524.1          221.2          3               4               5

Note that the table not only has the actual SSE's for the fit set, the hold-out set,
and the entire set, it has corresponding columns of ranks for each model. From Table 4-5,
we see that based on the SSE values of the hold-out set, the first and second best forecasts
were turned in by the Stanford-B and the Forsythe models, with SSEs of 5.5 and 95.4
respectively.

4-6.  Summary 

The results of this chapter's investigations are summarized in Table 4-6.


Table 4-6 Investigation Summary: Ranks Based on SSE's

Rank          (Section 4-2)     (Section 4-3)   (Section 4-4)             (Section 4-5)
              Fitting           Fitting         Fit/Forecast              Fit/Forecast
              Log-Linear Data   F-102 Data      F-102 Data                Notional C-17 Data
First Best    Log-Linear        AR(4)           AR(2)                     Stanford-B
Second Best   AR(4)             AR(2)           Stanford-B                Forsythe
Third Best    AR(3)             ARMA(2,2)       AR(4)                     Log-Linear
Fourth Best   Forsythe          ARMA(1,2)       Log-Linear and Forsythe   AR(4)

In Section 4-2, we investigated the ability of the candidate models to fit a
simulated log-linear data set. The standard Log-Linear Learning Curve model provided
the best fit to the data; this makes perfect sense since it was the standard Log-Linear
Learning Curve which produced the data in the first place! The second and third best fits
were provided by the AR(4) and AR(3) models respectively; their SSE statistics were
similar in magnitude (1438.8 vs 1756.4), so if we had to choose a model to fit log-linear
data based on this investigation, the concept of parsimony would probably have us choose
AR(3).

In Section 4-3, we investigated the ability of the candidate models to fit historical
F-102 data. In the fit test, the models which provided the first and second best fits to the
500 observation data set were the AR(4) and AR(2) models respectively. Their SSE
statistics were very close (4.877E+09 vs 4.930E+09), so if we had to choose a model to
fit the F-102 data, the concept of parsimony would have us choose the AR(2) model.

In Section 4-4, we developed forecast models for the F-102 data by using
hold-out sets; we used the first 20 observations to fit the forecast model and went on to
test it against the 480 observation hold-out data set. Based on SSE values alone, the first
and second best forecast models appear to be the AR(2) and Stanford-B models
respectively. The SSE for the AR(2) was only about half as large as the SSE for the
Stanford-B. In third place was the AR(4) model with an SSE about three times the value of the
Stanford-B value. One interesting thing to note is that the model which provided the best
fit to the first 20 observations did not provide the best forecast. The ARMA(2,2) model
provided the best fit to the first 20 observations but came in with the 8th best forecast.

In Section 4-5, we developed forecast models for the notional C-17 data by using
hold-out sets. The fit set consisted of the first 15 observations while the hold-out set
consisted of the last 8 observations. The first, second, and third best forecasts seem to be
given by the Stanford-B, the Forsythe, and the Log-Linear Learning Curve models
respectively. Once again, the models which provide the best forecasts did not necessarily
provide the best fits. The Stanford-B, the Forsythe, and the Log-Linear Learning Curve
models provided the 5th, 2nd, and 6th best fits, respectively.

The  results  are  somewhat  inconclusive  in  that  the  study  does  not  identify  any  one 
model  as  always  being  the  best  fitting  or  forecasting  tool  for  all  data  sets.  It  does, 
however,  provide  an  important  general  observation  that  ARMA  models  are  a  promising 
alternative  to  the  standard  log-linear  learning  curve  approach  which  is  widely  in  use 
today;  they  are  comparatively  simple  to  use,  intuitive  in  nature  and  seem  to  provide  a 
good forecast based on a small amount of data.

5. Simulating Future Performance


5-1.  Problem  Statement 

The  purpose  of  this  thesis  up  to  this  point  has  been  to  investigate  existing  learning  curve 
models  and  to  examine  alternative  models  which,  based  on  initial  production  data,  might  better 
predict  the  amount  of  time  future  aircraft  builds  will  take.  One  of  the  things  we  have  addressed 
only briefly, if at all, has been the uncertainty in fit of the models under study. In the C-17
Factory  Simulation  Model,  predictions  of  future  performance  were  made  by  treating  the  fitted 
regression  as  if  it  provided  a  perfect  forecast  of  the  mean  time  required  to  complete  a  task  in  the 
future.  Once  this  relationship  was  determined,  random  errors  were  generated  about  the  fitted 
curve  to  simulate  how  actual  performance  might  vary  in  the  future.  In  this  case,  the  simulation 
team  felt  they  could  adequately  predict  future  performance  by  ignoring  the  uncertainty  of  the 
fitted  regression. 

The  goal  of  this  chapter  is  to  explore  various  methods  for  addressing  the  following 
questions.  Given  a  learning  curve  model,  how  can  we  best  account  for  the  uncertainty  in  fit? 
Should  we  just  ignore  it  as  has  been  done  in  previous  aircraft  manufacturing  simulations?  Is  this 
a  sound  procedure?  Alternately,  should  we  adjust  the  variance  of  future  observations  to  account 
for  uncertainty?  Or  should  we  sample  from  a  distribution  of  parameter  estimates  in  each 
replication?  We  explore  this  through  the  use  of  basic  meta-modeling  concepts  applied  to  the 
learning curve model.

5-2. Simulation Based on a Fitted Linear Metamodel


What is a metamodel? The following definition, lifted from the dictionary (Merriam-Webster,
1977), gives a small amount of insight into the meaning of the meta (or met) prefix and
starts us on our way to understanding what a metamodel is!

meta- or met- prefix [from Latin, change, or Greek, among, with, after] 1a:
occurring later than or in succession to : after 1b: situated behind or beyond 1c: later or
more highly organized or specialized form of 2: change : transformation 3: more
comprehensive : transcending <metapsychology> - used with the name of a discipline to
designate a new but related discipline designed to deal critically with the original one
<metamathematics>

To  put  it  more  succinctly,  Russell  R.  Barton  (Barton,  1992)  says,  “a  metamodel  is  a  model  of  a 
model.” 

Let's look at an example of a metamodel. We first use an equation, say the learning
curve equation ($Y = aX^b + \text{error}$), to generate a data base; we did this in Chapter 3 when we
generated the first fifty observations of a specific learning curve. We next use the technique of
Ordinary Least Squares (OLS) regression to 'fit' a predictive model to that data. What we end
up with are parameter estimates defining a fitted model which characterizes the data base we
have generated (a model of a model!). When we make a (regression) model of the simulation
model, we speak of a (regression) metamodel (Kleijnen, 1992).

In  this  section,  we  examine  strategies  for  generating  simulated  values  from  a  fitted  linear 
model.  We,  in  particular,  want  to  simulate  future  performance;  in  other  words,  we  want  to 
extrapolate  outside  of  the  design  region  of  the  fitted  model.  The  case  in  which  we  are  interested, 
and  with  which  we  start,  is  the  case  where  the  independent  variable,  X,  represents  time,  or  some 
function  of  time.  We  assume  we  have  observed  the  dependent  variable,  Y,  for  values  of  the 
independent variable ranging from X = 1 to X = n. What we wish to do is to repeatedly simulate
future sequences of the dependent variable, Y, for values of the independent variable ranging
from X = n+1 to X = n+k for some specified k.

For simplicity, let's assume that the relationship between a dependent variable, Y, and an
independent variable, X, can be described via a simple linear model of the form

$$Y = \alpha + \beta X + \varepsilon, \tag{5-1}$$

where $\varepsilon \sim N(0, \sigma^2)$. If we know the values of the parameters $\alpha$, $\beta$, and $\sigma^2$, we can say that

$$Y \sim N(\alpha + \beta X, \sigma^2) \tag{5-2a}$$

or, equivalently,

$$\frac{Y - \alpha - \beta X}{\sigma} \sim N(0, 1). \tag{5-2b}$$

Better yet, and still equivalently, we could denote this as

$$Y = \alpha + \beta X + \sigma[N(0,1)] \tag{5-2c}$$

where the last term in (5-2c) indicates that we have added to the mean ($\alpha + \beta X$) a term equal to
$\sigma$ times a normally distributed variate with a mean of zero and a standard deviation of one.
We can use this to generate simulated values of Y for given values of X. If we wish to generate
a value of the dependent variable for, say, $X = X_h$, we can use the following simple algorithm:

1. Generate $Z \sim N(0, 1)$.

2. Set $Y = \alpha + \beta X_h + \sigma Z$.
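As a minimal sketch of this two-step algorithm, assuming the parameters are known exactly, one could write the following in Python. The function name and argument names are illustrative only and do not come from the Factory Simulation Model.

```python
import random

def simulate_known(alpha, beta, sigma, x_values, rng):
    """For each X_h: draw Z ~ N(0, 1), then set Y = alpha + beta*X_h + sigma*Z."""
    ys = []
    for xh in x_values:
        z = rng.gauss(0.0, 1.0)                    # step 1
        ys.append(alpha + beta * xh + sigma * z)   # step 2
    return ys

# Example: one simulated future sequence for X = 21, ..., 25.
rng = random.Random(1)
future = simulate_known(2.0, 0.5, 3.0, range(21, 26), rng)
```

Repeating the call with the same parameters gives independent replications of the future sequence, all scattered about the same known line with the same known standard deviation.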

For our purposes we assume, for simplicity, that $X_h = h$. This allows the simulated sequences
$(Y_{n+1}, Y_{n+2}, \ldots, Y_{n+k})$ to be generated in a relatively straightforward manner. So, when we
know the values of the parameters $\alpha$, $\beta$, and $\sigma^2$, there is no uncertainty in the fit since the mean
and standard deviation are known.

The uncertainty problem arises when we DON'T know the values of these parameters.
This leads us to the first of three methods of dealing with uncertainty in fit.

Method  1 

If we do not know the values of the parameters $\alpha$, $\beta$, and $\sigma^2$, which is most often the case,
they must be estimated from sample data. Let's assume we observe the data points $(X_1, Y_1)$,
$(X_2, Y_2), \ldots, (X_n, Y_n)$; from these, we can estimate $\alpha$, $\beta$, and $\sigma^2$ via ordinary least squares. The
least squares estimates of these parameters turn out to be

$$b = \frac{\sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{n}}{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}, \tag{5-3}$$

$$a = \bar{Y} - b\bar{X}, \tag{5-4}$$

and

$$\hat{\sigma}^2 = MSE = \frac{\sum (Y_i - a - bX_i)^2}{n-2} \tag{5-5}$$

(Neter, Wasserman, and Kutner: 1990, pgs 50, 145). We know from basic statistics that, in this
case,

$$E(a) = \alpha, \tag{5-6a}$$

$$E(b) = \beta, \tag{5-6b}$$

and

$$E(MSE) = \sigma^2. \tag{5-6c}$$

Furthermore, in most statistics texts, the predicted value of Y at $X = X_h$ is generally denoted as
$\hat{Y}_h = a + bX_h$. Under the assumptions we have made, we can say that

$$E(\hat{Y}_h) = E(a + bX_h) \tag{5-7}$$

and, since we know (as stated above in equations 5-6) that a, b, and MSE are unbiased estimators
of $\alpha$, $\beta$, and $\sigma^2$ respectively, we can go a step further to say

$$E(\hat{Y}_h) = E(\alpha + \beta X_h). \tag{5-8}$$

All of this having been said, we might be tempted, for simplicity, to generate simulated values of
Y at $X = X_h$ via the algorithm:

1. Generate $Z \sim N(0, 1)$.

2. Set

$$Y = \hat{Y}_h + (MSE)^{1/2} Z \tag{5-9a}$$

or, equivalently,

$$Y = a + bX_h + (MSE)^{1/2} Z. \tag{5-9b}$$
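A minimal sketch of Method 1 follows, assuming the least squares estimates are computed from the usual sums of X, Y, XY, and X squared, with MSE taken on n-2 degrees of freedom. The function names are hypothetical; this is an illustration of the algorithm as stated, not the code of the Factory Simulation Model.

```python
import random

def ols_estimates(xs, ys):
    """Return the least squares estimates (a, b) and MSE for Y = a + b*X."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (sxy - sx * sy / n) / (sxx - sx * sx / n)
    a = sy / n - b * sx / n
    mse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    return a, b, mse

def simulate_method1(xs, ys, x_future, rng):
    """Treat the fitted line as exact: Y = a + b*X_h + sqrt(MSE)*Z with Z ~ N(0, 1)."""
    a, b, mse = ols_estimates(xs, ys)
    return [a + b * xh + mse ** 0.5 * rng.gauss(0.0, 1.0) for xh in x_future]
```

Note that the spread of the simulated values is the same at every future X, since the estimation error in a and b is ignored.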

This method yields a distribution of Y's which has a constant variance, as we would expect, and
is very similar to the strategy implemented in the recent Factory Simulation Model of the C-17
assembly process. The Factory Simulation Model is consistent with the model we postulate above
in that it has the property that the variance of the simulated Y's is constant and not dependent on
$X_h$. Unfortunately, because it is implicitly based on the assumption that

$$Y \sim N(\hat{Y}_h, MSE) \tag{5-10a}$$

or, equivalently,

$$\frac{Y - \hat{Y}_h}{(MSE)^{1/2}} \sim N(0,1), \tag{5-10b}$$

we find that our postulated model cannot be strictly correct. We say this because, according to
Neter, Wasserman & Kutner (1990), for example,

$$Var(\hat{Y}_h) = \sigma^2 \left[ \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right] \tag{5-11}$$

and thus the quantity

$$\frac{Y - \hat{Y}_h}{\sqrt{MSE \left[ 1 + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]}} \tag{5-12}$$

(which is proportional to the quantity on the left hand side of equation 5-10b) does not have a
N(0,1) distribution as stated above. Instead of an N(0,1) distribution, it has a Student's t
distribution with n-2 degrees of freedom. Based on this realization, we can propose another
strategy which might be more accurate.

Method  2 


The next possible strategy is given by the following algorithm.

1. Generate $T \sim t(n-2)$.

2. Set

$$Y = \hat{Y}_h + T \sqrt{MSE \left[ 1 + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]} \tag{5-13}$$
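Method 2 can be sketched as below. The sketch assumes the scale factor is the usual standard error for predicting a new observation, the square root of MSE times (1 + 1/n + (X_h - Xbar)^2 / sum((X_i - Xbar)^2)), and it draws Student's t variates as a standard normal divided by the square root of a chi-square over its degrees of freedom. The function names are hypothetical.

```python
import math
import random

def student_t(df, rng):
    """Draw T ~ t(df) as Z / sqrt(chi2/df); chi2(df) is Gamma(shape=df/2, scale=2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)

def prediction_sd(xh, xs, mse):
    """Estimated standard deviation of a new observation at X = xh."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return math.sqrt(mse * (1.0 + 1.0 / n + (xh - xbar) ** 2 / sxx))

def simulate_method2(a, b, mse, xs, xh, rng):
    """Y = (a + b*xh) + T * prediction_sd, with T ~ t(n - 2)."""
    t = student_t(len(xs) - 2, rng)
    return a + b * xh + t * prediction_sd(xh, xs, mse)
```

Evaluating `prediction_sd` at increasing values of `xh` shows directly how the spread of the simulated values grows as the forecast moves away from the mean of the observed X values.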


A potential weakness of this strategy is that the variance of the simulated Y's is not
constant but, as can be seen from inspection of equation (5-13), depends on $X_h$. In fact, the
variance of the simulated Y's increases as the distance between $X_h$ and $\bar{X}$ increases.
Although this strategy correctly accounts for the uncertainty within our estimates of the
parameters $\alpha$ and $\beta$, it is somewhat undesirable since the behavior is not consistent with that
which would be expected from the real system*. This would be especially unappealing in the
time-dependent case where we successively generate values of Y corresponding to values of X
ranging from X = n+1 to n+k. Finally, we are led to a third method which attempts to deal with
this shortcoming.

* In a real system, we would expect the variance to remain constant.

Method  3 


The third method takes into consideration the fact that, since a and b are normally distributed, the standardized statistics (b-β)/s(b) and (a-α)/s(a), where s(a) and s(b) are estimates of the standard deviation of a and b respectively, are distributed as t with n-2 degrees of freedom. Since the estimates of the standard deviations, s(a) and s(b), are given by

$$s(a) = \sqrt{MSE \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right]}, \qquad s(b) = \sqrt{\frac{MSE}{\sum (X_i - \bar{X})^2}}$$  (5-14)

(Neter, Wasserman & Kutner: 1990, pg 71), the standardized statistics (b-β)/s(b) and (a-α)/s(a) become

$$\frac{a - \alpha}{\sqrt{MSE \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right]}}$$  (5-15a)

$$\frac{b - \beta}{\sqrt{\frac{MSE}{\sum (X_i - \bar{X})^2}}}$$  (5-15b)

and are distributed as t(n-2).

With this in mind, a possible strategy for generating simulated values of Y for X = $X_h$ is as follows:

1.  Generate $T_1$ and $T_2$ ~ t(n-2).

2.  Set $A = a + T_1\, s(a)$  (5-16a)

3.  Set $B = b + T_2\, s(b)$  (5-16b)

4.  Generate Z ~ N(0,1).

5.  Set $Y = A + BX_h + (MSE)^{1/2} Z$.  (5-16c)

If a sequence of values starting at X = n+1 is desired, one could repeat steps 4 and 5 for each successive value of X. Although the resulting values of Y will, in the aggregate, have a variance that increases as the distance between $X_h$ and $\bar{X}$ increases, if we apply steps 4 and 5 iteratively for given values of A and B, the individual sequences generated will behave like observations from a linear model with a constant variance.
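The five steps of Method 3 might be sketched as follows. The forms used for A and B (the fitted coefficient plus an independent t draw times its estimated standard deviation) are an assumption inferred from the standardized statistics in (5-15a) and (5-15b). All function names are hypothetical.

```python
import math
import random


def t_variate(df, rng):
    """Student's t draw with df degrees of freedom: Z / sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)   # Gamma(df/2, scale 2) is chi-square(df)
    return z / math.sqrt(chi2 / df)


def coefficient_sds(mse, xs):
    """Estimated standard deviations s(a) and s(b) from equation (5-14)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    s_a = math.sqrt(mse * (1.0 / n + x_bar ** 2 / sxx))
    s_b = math.sqrt(mse / sxx)
    return s_a, s_b


def simulate_method3_sequence(a, b, mse, xs, x_values, rng):
    """One simulated sequence: draw A and B once, then add N(0, MSE) noise per X."""
    df = len(xs) - 2
    s_a, s_b = coefficient_sds(mse, xs)
    big_a = a + t_variate(df, rng) * s_a     # steps 1-2 (assumed form of 5-16a)
    big_b = b + t_variate(df, rng) * s_b     # steps 1, 3 (assumed form of 5-16b)
    # Steps 4 and 5, repeated for each X with A and B held fixed.
    return [big_a + big_b * x + math.sqrt(mse) * rng.gauss(0.0, 1.0)
            for x in x_values]
```

Calling `simulate_method3_sequence` several times gives sequences that fan out from one another, because each call draws its own A and B, while the scatter within any single sequence stays constant.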

With  all  of  this  theory  behind  us,  we  next  demonstrate  each  of  the  three  methods  using, 
first,  a  linear  model,  and  second,  a  log-linear  model. 

5-3.  Linear  Case  Studies 

To  simplify  the  development  process,  we  start  with  modeling  and  simulating  a  simple 
linear  relationship.  Once  the  methods  outlined  above  are  demonstrated  using  this  example,  we 
go  on,  in  the  next  section,  to  apply  them  to  the  Log-Linear  Learning  Curve  equation. 

Generating 20 Simulated Observations: Here we assume we know the values of the parameters α (intercept), β (slope), and σ² (fixed variance). Using EXCEL, we start out by simulating a set of 20 observations calculated from the following equation.

$Y = \alpha + \beta X + \sigma[N(0,1)]$  (5-17)

We add the σ[N(0,1)] term to make the simulated observations stochastic.

Once the data is simulated, we use the method of ordinary least squares to fit a linear regression model to the 20 simulated observations; this, of course, yields estimated values for α, β, and σ² (a, b, and MSE, respectively). So, we've made a model of a model. These fitted values are used throughout the following three methods as the basis for simulating future values of the dependent variable.
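The thesis performs this step in EXCEL. As a hedged illustration only, the same two stages (simulating observations from equation (5-17), then fitting by ordinary least squares) could be sketched in Python as below. The function names are hypothetical, and the choice X = 1, 2, ..., n is an assumption about how the observations are indexed.

```python
import random


def simulate_line(alpha, beta, sigma, n, rng):
    """Simulate n observations from Y = alpha + beta*X + sigma*N(0,1), X = 1..n."""
    xs = list(range(1, n + 1))
    ys = [alpha + beta * x + sigma * rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def ols_fit(xs, ys):
    """Ordinary least squares for a straight line; returns (a, b, MSE)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse / (n - 2)               # MSE uses n-2 degrees of freedom
```

Re-running `simulate_line` with a different seed and refitting shows how much a, b, and MSE move around when only 20 observations are available.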

Method  1: 

Once the theoretical line and confidence interval are plotted, we simulate and plot five future sequences of values $Y_h$ for $X_h$ = 21 through $X_h$ = 120 on the same set of axes. To calculate these sequences, we use equation (5-9b). This done, we repeat the process with three more sets of 20 initial observations and regenerate the plot three times to show the random nature of the simulated data and the uncertainty in fit of the predicted line. These plots are shown below, in Figure 5-1, for illustration.

Note, in the plots, how the random nature of the simulated data has a significant effect on the fitted regression line. The fitted line does not always accurately model the true relationship; a small error in the estimate of slope can seriously skew the fit away from the simulated data in the long run. As has no doubt been said before, perhaps we shouldn't treat the fitted regression as if it provided a perfect forecast of the expected value in the future.

We also note that the variance of each individual simulated sequence is, by construction, constant over time. This is as we would expect from a real-life system.

Figure 5-1 Method 1 for Equation of Line - Four Simulations

We must keep in mind that the calculations done to generate the preceding simulated sequences were not entirely correct, since they were built upon the assumption, as given in equation (5-10b), that

$$\frac{Y - \hat{Y}_h}{(MSE)^{1/2}} \sim N(0,1)$$

when, in actuality, it is proportional to a quantity that has a Student's t distribution. Method 2 will work the Student's t distribution into the calculation of the $Y_h$'s.


Method  2 


In a manner analogous to the techniques used in Method 1, we again generate 20 simulated observations, use them to fit a line, and then predict five series of $Y_h$ for X = 21 to X = 120. To calculate these new sequences, we use equation (5-13). The only difference here is that, as outlined in Section 5-2, Y is estimated as $\hat{Y}_h$ plus a standard deviation times a draw from a Student's t distribution. The standard deviation term is dependent on $X_h$ and grows in magnitude as we go forward in time; see Figure 5-2. Once again we can observe the uncertainty in fit of the regression line and the five simulated sequences; their slopes vary from plot to plot. This could be the case in subsequent experiments since the simulated sequences are built upon the fitted regression line.


Figure 5-2 Method 2 for Equation of Line - Four Simulations

The variance, unfortunately, is not constant as we expect to find in the real-life system. To compensate for this disparity between the simulated sequences and what we expect in reality, we go on to Method 3, which provides us with simulated sequences which each appear to have relatively constant variance through the range of $X_h$'s.
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Method  3 


Following the same technique used for Methods 1 and 2, above, we can, once again, generate 20 simulated observations, use OLS regression to fit a line, and then predict five series of $Y_h$ for X = 21 to X = 120 for comparison to the theoretical line. To calculate these sequences, we use equations (5-16a) and (5-16b). The resulting plots are shown in Figure 5-3.


Figure 5-3 Method 3 for Equation of Line - Four Simulations

Note how the five simulated series of $Y_h$'s, for $X_h$ = 21 through $X_h$ = 120, unlike the series produced in Methods 1 and 2 above, all go off in slightly different directions while at the same time maintaining a constant variance within each individual series. What this helps us to see is that we can take into account the uncertainty of the fit of the regression line by simulating not just one but multiple series of future values for $Y_h$. In this way, we build a confidence interval of sorts (made up of the plots of the individual sequences) which helps us to keep in mind the fact that the regression line and its confidence intervals do not necessarily provide a perfect fit for what the actual data might be expected to do. In the C-17 Factory Simulation, little or no effort was made to account for this uncertainty.

In the next section (Section 5-4), we simulate the first twenty observations of the learning curve the same way we simulate them for the equation of the line above. As before, we also go on to fit this simulated data (building a model of a model) using a log transformation and OLS regression. The goal of the next section is to investigate the nature of the uncertainty in the Log-Linear Learning Curve forecasts and to determine if we even need to account for it in the construction of our simulated sequences of $Y_h$'s. Because of the rescaling produced by the log transformations, the discrepancy in fit which we see in the linear case may not be apparent in the case of the log-linear learning curve.
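To make the log transformation concrete, the following sketch fits Y = aX^b by regressing ln Y on ln X with ordinary least squares and then transforming the intercept back. The function name `fit_learning_curve` is hypothetical, and the sketch assumes all unit numbers and hours are strictly positive so that the logarithms exist.

```python
import math


def fit_learning_curve(units, hours):
    """Fit Y = a * X**b by OLS on (ln X, ln Y); returns (a, b, MSE in log space)."""
    lx = [math.log(x) for x in units]
    ly = [math.log(y) for y in hours]
    n = len(lx)
    x_bar = sum(lx) / n
    y_bar = sum(ly) / n
    sxx = sum((x - x_bar) ** 2 for x in lx)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(lx, ly))
    slope = sxy / sxx                        # estimate of b
    intercept = y_bar - slope * x_bar        # estimate of ln a
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(lx, ly))
    return math.exp(intercept), slope, sse / (n - 2)
```

Because the fit is carried out in log space, the returned MSE describes scatter in ln Y rather than in hours, which is one reason the spread of the simulated sequences can look different in log space and in linear space.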

5-4.  Log-Linear  Learning  Curve  Case  Studies 

Although  we  discovered  with  the  linear  cases  above  that  the  uncertainty  in  fit  could 
produce  significant  inaccuracy,  we  may  not  discover  that  the  uncertainty  for  the  log-linear  case 
is  worth  accounting  for.  Since  this  is  difficult  to  know  without  investigation,  we  shall  now  go 
on  to  see  what  effect  Methods  1,  2,  and  3  have  on  the  associated  plots  for  the  learning  curve. 
Method  1 

In a manner analogous to the techniques used in Method 1 (Section 5-3), we again generate 20 simulated observations; this time we use the log-linear learning curve equation $Y = aX^b$ + error. We then fit a curve to these observations and predict five series of $Y_h$ for X = 21 to X = 120. To calculate these new sequences, we use equation (5-9b). Again, we recalculate the simulated observations, the regression curve, and the simulated sequences of $Y_h$'s. These are all shown in Figure 5-4. We note, from the four simulations in Figure 5-4, that the fit of the regression line varies almost indiscernibly from plot to plot.


Figure  5-4  Method  1  for  Log-Linear  Equation  In  Log  Space  —  Four  Simulations 


In addition to the plots in log space shown in Figure 5-4, we also generated plots in linear space by plotting hours vs. unit number instead of plotting the natural log of hours versus the natural log of the unit number as we did for the plots in log space; the plots in linear space are shown in Figure 5-5. We can make the same observations about these plots that we made regarding those in Figure 5-4; the fit of the regression curve varies almost indiscernibly from plot to plot. Since Method 1 is very similar to the method used in the C-17 FSM, we start to think, based on these plots, that ignoring the uncertainty in fit might not have the high price tag which we originally expected. Despite these early indications that we might not need to account for uncertainty (at least not in the case of the log-linear learning curve), we'll look at the effects that Methods 2 and 3 have on the fitted regression curves.


Figure  5-5  Method  1  for  Log-Linear  Equation  In  Linear  Space  —  Four  Simulations 
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Method  2 

Once again, we generate four sets of simulated observations, four fitted regression lines, and four sets of five sequences of simulated future observations ($Y_h$'s). This time, as in Method 2, Section 5-3, we use equation (5-13), which corrects the assumption regarding the distribution of the standardized error, which is actually distributed as t(n-2), not N(0,1). We plot the resulting data both in log space and in linear space; these plots are shown in Figures 5-6 and 5-7 respectively. In Figure 5-6, we see the same increasing variance evident in Method 2, Section 5-3. Once again, however, the fitted regression curve seems not to vary noticeably from simulation to simulation.

Figure 5-6 Method 2 for Log-Linear Equation In Log Space - Four Simulations

Figure 5-7 Method 2 for Log-Linear Equation In Linear Space - Four Simulations

Method 3


One last time, we generate the 20 simulated observations, the fitted regressions, and the simulated sequences of future observations. This time when we generate the simulated sequences, we use equations (5-16a) through (5-16c). The result is a series of simulated sequences, each of which seems to have a constant variance. Once again, we've gone on to plot the resulting data in both log and linear space; these plots are shown below in Figures 5-8 and 5-9.

Figure 5-8 Method 3 for Log-Linear Equation In Log Space - Four Simulations

Figure 5-9 Method 3 for Log-Linear Equation In Linear Space - Four Simulations

5-5.  Conclusions

From the plots in Figures 5-4 through 5-9, we can say that perhaps the C-17 FSM was not seriously harmed by the fact that, during its development, uncertainty in fit was ignored. Method 1, which is similar to the methods used in the C-17 FSM, produced reasonable fits, not only in log space, but in linear space as well. Method 2 and Method 3 seemed to make no appreciable improvement in the fits.

6.  Conclusions


The results of this thesis are somewhat inconclusive since the study does not identify any one model as always being the best fitting or forecasting tool for all sets of data. It does, however, provide an important general indication that ARMA models, particularly the AR models, are a promising alternative to the standard log-linear learning curve approach which is widely in use today; they are comparatively simple to use, intuitive in nature, and seem to provide a good forecast based on a small amount of data.

The investigation of metamodels and the question of accounting for the uncertainty in fit within the C-17 Factory Simulation Model yielded some encouraging, as well as potentially useful, results. In the case of the simple linear model, we found that the application of our proposed Methods 2 and, especially, 3 did a good job of accounting for uncertainty within the fit of the regression line; the resulting simulated sequences seemed to cluster well around the theoretical confidence interval. In the case of the log-linear learning curve, however, the results were not as striking. In particular, we found that our Method 1, which is very similar to the methods used in the C-17 FSM, did a reasonable job of simulating sequences of (future) observations which, for the most part, fell within the theoretical prediction interval. Application of Methods 2 and 3 did not noticeably improve the fidelity of the simulated sequences. The bottom line, here, is that ignoring the uncertainty in fit (as the C-17 FSM does) doesn't seem to carry the high cost we expected.

There are a few things which, based on hindsight, I might have done differently. First of all, in Sections 3-6 and 4-2, I would have used a log-linear learning curve data set which had no error in it. (The model we used had a uniformly distributed error term with a very small variance.) Using data which had no error term may have provided a more useful initial analysis since the models would have had a 'clean' data set for the initial test instead of having to deal with the random component injected by the error term. Of course, the error we used in the model was quite small and may have had a negligible effect anyway.

Another thing I might have done differently was to use all of the models to fit/forecast the log-linear learning curve data the same way we used them to fit/forecast the F-102 data and the Notional C-17 data; this would have provided a more complete picture of the abilities of the candidate models. As it was, in Section 4-2, we used the models only to fit the log-linear data; in Section 4-3, we used the models to fit the F-102 data and then went on and used them to fit/forecast the F-102 data; in Section 4-4, we used the models only to fit/forecast the Notional C-17 data. Each section should have used precisely the same set of investigations.

Yet another modification I might make to my investigations would be to pick a sample size for the fit/forecasting investigations which was common to all data sets. For example, since the Notional C-17 data fit/forecasting investigations used a sample of 15 observations on which to base the fitting of the models, we should have used this same number in the fit/forecasting of the F-102 data; furthermore, if our investigations had included fit/forecasting for the log-linear data, this number should also have been the same for them. Using different numbers of observations in the fit part of the fit/forecasting put the candidate models on uneven ground and made it difficult to evaluate the results of the study.

One more thing I would do, if there were more time, would be to take a closer look at the SSE's (used as measures of performance in Chapter 4). It would be interesting to see how close in magnitude the resulting SSE's are. Perhaps we'd find that there is virtually no difference between the fidelity of the top four models; we might find that any one of them will do an equally good job at forecasting, so we could choose the lowest order or simplest model from among them in pursuit of parsimony or simplicity. On the other hand, we might find that the model ranked number one had a statistic which was six orders of magnitude better than the next best model's statistic; this might keep us from choosing a lower order model. I feel that the relative magnitudes of the SSE's are quite an important consideration in the analysis of the results.

The point of this thesis was not to show that log-linear learning curve models are no good at forecasting learning type data, that ARMA models are the best models for this type of forecasting, or that the C-17 FSM should be scrapped because it failed to account for uncertainty in fit. Instead, the objective was to show that there are other models which provide a viable alternative to the standard learning curve. The secondary emphasis was to investigate the merit/cost of ignoring, within the C-17 FSM, the uncertainty in fit. I believe I've accomplished both of these things. Hopefully, model developers can utilize some of the information in this thesis to assist them in developing the best forecasting models possible.
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Appendix  A 

Forecast-Pro  Summary  and  Worksheets 
(Work  done  in  Forecast-Pro.) 


Table  A-1  Forecast-Pro,  Summary  of  Statistics  and  Parameters 
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Forecast  Pro  for  Windows  Standard  Edition  Version  2.00 
Mon  Jan  29  19:55:18  1996 


Expert  data  exploration  of  dependent  variable  PWRMEANS 


Length  50  Minimum  283.069  Maximum  1002.019 
Mean  402.219  Standard  deviation  141.123 

Series  too  short  to  determine  seasonality.  Treating  as  nonseasonal. 

Classical  decomposition  (nonseasonal) 

Trend-cycle:  97.11%  Irregular:  2.89% 

Log transform recommended for Box-Jenkins.

There  are  no  strongly  significant  regressors,  so  I  will  choose 
a  univariate  method. 

Exponential  smoothing  outperforms  Box-Jenkins  by  2.588  to  3.628 
out-of-sample  (MAD).  I  tried  21  forecasts  up  to  a  maximum  horizon  6. 

For  Box-Jenkins,  I  used  a  log  transform. 

Series  is  trended  and  nonseasonal. 

Recommended  model:  Exponential  Smoothing 

Figure  A-1  Forecast-Pro,  Exponential  Smoothing 


Simple  exponential  smoothing 


Forecast  Model  for  PWRMEANS 

Simple  exponential  smoothing:  No  trend,  No  seasonality 

Confidence  limits  proportional  to  level 

Smoothing  Final 
Component  Weight  Value 


Level  1.00000  286.22 

Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  .9954 
Durbin-Watson  0.9442 
Forecast  error  35.43 
MAPE  0.02687 
MAD  14.53 


Number  of  parameters  1 
Standard  deviation  142.6 
Adjusted  R-square  0.9382 

**  Ljung-Box(18)=42.45  P=0.999 
BIC  36.47 
RMSE  35.07 


Figure  A-2  Forecast-Pro,  Simple  Exponential  Smoothing 
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Naive  (SMA(l),  Random  walk) 


Forecast  Model  for  PWRMEANS 
Automatic  model  selection 
Naive (SMA(1), Random walk)

Standard  Diagnostics 


Sample  size  49 
Mean  390 
R-square  0.9022 
Durbin-Watson  0.2607 
Forecast  error  35.43 
MAPE  0.02742 
MAD  14.83 


Number  of  parameters  0 
Standard  deviation  114.4 
Adjusted  R-square  0.9042 

**  Ljung-Box(18)=41.51  P=0.9987
BIC  35.43 
RMSE  35.43 


Figure  A-3  Forecast-Pro,  Simple  Moving  Average  (SMA(1))
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Holt  exponential  smoothing 


Forecast  Model  for  PWRMEANS 
Automatic  model  selection 

Holt  exponential  smoothing:  Linear  trend,  No  seasonality 
Confidence  limits  proportional  to  level 

Smoothing  Final 
Component  Weight  Value 


Level  0.99973  286.22 

Trend  0.23092  -1.3716 

Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.9611 
Durbin-Watson  1.938 
Forecast  error  28.39 
MAPE  0.02575 
MAD  13.83 


Number  of  parameters  2 
Standard  deviation  142.6 
Adjusted  R-square  0.9603 

Ljung-Box(18)=6.991  P=0.009796 
BIC  30.08  (Best  so  far) 

RMSE  27.82 

Figure  A-4  Forecast-Pro,  Holt  Exponential  Smoothing 
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Forecast  Model  for  PWRMEANS 
ARIMA(1,0,0) 


Term  Coefficient  Std.  Error  t-Statistic  Significance 

a[1]  0.9953  0.0056  178.7528  1.0000

CONST  1.9008 


Forecast  Model  for  PWRMEANS 
ARIMA(2,0,0) 


Term  Coefficient  Std.  Error  t-Statistic  Significance 


a[1]  1.7947  0.0479  37.4544  1.0000

a[2]  -0.8018  0.0494  -16.2365  1.0000

_CONST  2.8753

Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.9841 
Durbin-Watson  2.04 
Forecast  error  18.16 
MAPE  0.01623 
MAD  8.227 


Number  of  parameters  2 
Standard  deviation  142.6 
Adjusted  R-square  0.9838 
Ljung-Box(18)=5.466  P=0.002073 
BIC  19.24  (Best  so  far) 

RMSE  17.79 


Figure  A-6  Forecast-Pro,  AR(2) 
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Forecast  Model  for  PWRMEANS 
ARIMA(3,0,0) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

a[1]  1.5845  0.0448  35.3651  1.0000

a[2]  -0.2266  0.0986  -2.2984  0.9740

a[3]  -0.3632  0.0556  -6.5293  1.0000

_CONST  2.1476


Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.9833 
Durbin-Watson  1.686 
Forecast  error  18.83 
MAPE  0.01657 
MAD  8.382 


Number  of  parameters  3 
Standard  deviation  142.6 
Adjusted  R-square  0.9826 

Ljung-Box(18)=3.235  P=4.927e-005 
BIC  20.53 
RMSE  18.25 


Figure  A-7  Forecast-Pro,  AR(3) 
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Forecast  Model  for  PWRMEANS 
ARIMA(4,0,0) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

a[1]  1.6203  0.0414  39.1669  1.0000

a[2]  -0.2361  0.1250  -1.8883  0.9347

a[3]  -0.4572  0.1958  -2.3350  0.9760

a[4]  0.0676  0.0995  0.6791  0.4995

_CONST  2.1717

Try  alternative  model  ARIMA(3,0,0) 


Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.9844 
Durbin-Watson  1.733 
Forecast  error  18.4 
MAPE  0.01603 
MAD  8.023 


Number  of  parameters  4 
Standard  deviation  142.6 
Adjusted  R-square  0.9833 

Ljung-Box(18)=3.237  P=4.951e-005 
BIC  20.64 
RMSE  17.65 


Figure  A-8  Forecast-Pro,  AR(4) 




Forecast  Model  for  PWRMEANS 
ARIMA(0,0,1) 


Term  Coefficient  Std.  Error  t-Statistic  Significance 


b[1]  -0.9675  0.0200  -48.4850  1.0000

_CONST  402.2187

Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.8037 
Durbin-Watson  0.05782 
Forecast  error  63.16
MAPE  0.1155 
MAD  47.34 


Number  of  parameters  1 
Standard  deviation  142.6 
Adjusted  R-square  0.8037 

**  Ljung-Box(18)=158.7  P=1
BIC  65.02 
RMSE  62.52 


Figure  A-9  Forecast-Pro,  MA(1) 
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Forecast  Model  for  PWRMEANS 
ARIMA(0,0,2) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

b[1]  -1.3321  0.0563  -23.6572  1.0000

b[2]  -0.9397  0.0413  -22.7603  1.0000

_CONST  402.2187

Standard  Diagnostics

Sample  size  50
Mean  402.2
R-square  0.9254
Durbin-Watson  0.5625
Forecast  error  39.33
MAPE  0.07117
MAD  28.74

Number  of  parameters  2
Standard  deviation  142.6
Adjusted  R-square  0.9239
**  Ljung-Box(18)=121.6  P=1
BIC  41.67
RMSE  38.53

Figure  A-10  Forecast-Pro,  MA(2)

Forecast  Model  for  PWRMEANS 
ARIMA(0,0,3) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

b[1]  -1.5788  0.0562  -28.0915  1.0000

b[2]  -1.5280  0.0661  -23.1201  1.0000

b[3]  -0.8981  0.0410  -21.8820  1.0000

_CONST  402.2187

Standard  Diagnostics


Sample  size  50 
Mean  402.2 
R-square  0.9723 
Durbin-Watson  0.8157 
Forecast  error  24.23 
MAPE  0.04995 
MAD  19.54 


Number  of  parameters  3 
Standard  deviation  142.6 
Adjusted  R-square  0.9711

**Ljung-Box(18)=128.3  P=1 
BIC  26.42 
RMSE  23.49 


Figure  A-11  Forecast-Pro,  MA(3)
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Forecast  Model  for  PWRMEANS 
ARIMA(0,0,4) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

b[1]  -1.7923  0.0546  -32.7994  1.0000

b[2]  -2.2399  0.0883  -25.3794  1.0000

b[3]  -1.6646  0.0802  -20.7439  1.0000

b[4]  -0.8566  0.0452  -18.9600  1.0000

_CONST  402.2187

Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.9872 
Durbin-Watson  1.09 
Forecast  error  16.66 
MAPE  0.03225 
MAD  12.63 


Number  of  parameters  4 
Standard  deviation  142.6 
Adjusted  R-square  0.9863 
**  Ljung-Box(18)=107.4  P=1 
BIC  18.69  (Best  so  far) 

RMSE  15.98 


Figure  A-12  Forecast-Pro,  MA(4)
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Forecast  Model  for  PWRMEANS 
ARIMA(1,0,1) 


Term  Coefficient  Std.  Error  t-Statistic  Significance 


a[1]  0.9954  0.0061  163.9297  1.0000

b[1]  -0.8783  0.0604  -14.5466  1.0000

_CONST  1.8609 

Standard  Diagnostics 


Sample  size  50 
Mean  402.2 
R-square  0.9646 
Durbin-Watson  0.8359 
Forecast  error  27.09 
MAPE  0.02662 
MAD  13.28 


Number  of  parameters  2 
Standard  deviation  142.6 
Adjusted  R-square  0.9639 

Ljung-Box(18)=20.31  P=0.6841 
BIC  28.7 
RMSE  26.54 


Figure  A-13  Forecast-Pro,  ARMA(1,1)
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Forecast  Model  for  PWRMEANS 
ARIMA(2,0,1) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

a[1]  1.8974  0.0387  49.0106  1.0000

a[2]  -0.9048  0.0403  -22.4327  1.0000

b[1]  -0.2490  0.1061  -2.3476  0.9768

_CONST  2.9941

Standard  Diagnostics


Sample  size  50 
Mean  402.2 
R-square  0.9865 
Durbin-Watson  2.348 
Forecast  error  16.92 
MAPE  0.0181 
MAD  8.177 


Number  of  parameters  3 
Standard  deviation  142.6 
Adjusted  R-square  0.9859 

Ljung-Box(18)=6.226  P=0.004808 
BIC  18.45  (Best  so  far) 

RMSE  16.4 


Figure  A-14  Forecast-Pro,  ARMA(2,1) 
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Forecast  Model  for  PWRMEANS 
ARIMA(1,0,2) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

a[1]  0.9991  0.0041  241.5930  1.0000

b[1]  -1.1743  0.0591  -19.8619  1.0000

b[2]  -0.9597  0.0334  -28.7005  1.0000

_CONST  0.3623

Standard  Diagnostics


Sample  size  50 
Mean  402.2 
R-square  0.9908 
Durbin-Watson  1.73 
Forecast  error  13.97 
MAPE  0.01687 
MAD  8.051 


Number  of  parameters  3 
Standard  deviation  142.6 
Adjusted  R-square  0.9904 
Ljung-Box(18)=28.68  P=0.9476 
BIC  15.23  (Best  so  far) 

RMSE  13.54 


Figure  A-15  Forecast-Pro,  ARMA(1,2)
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Forecast  Model  for  PWRMEANS 
ARIMA(2,0,2) 


Term  Coefficient  Std.  Error  t-Statistic  Significance

a[1]  1.8956  0.0462  41.0309  1.0000

a[2]  -0.9072  0.0468  -19.3879  1.0000

b[1]  -0.4297  0.1050  -4.0940  0.9998

b[2]  -0.5006  0.1039  -4.8165  1.0000

_CONST  4.6625

Standard  Diagnostics

Sample  size  50
Mean  402.2
R-square  0.9893
Durbin-Watson  2.207
Forecast  error  15.23
MAPE  0.01859
MAD  8.357

Number  of  parameters  4
Standard  deviation  142.6
Adjusted  R-square  0.9886

Ljung-Box(18)=14.53  P=0.3059
BIC  17.08  (Best  so  far)

RMSE  14.61

Figure  A-16  Forecast-Pro,  ARMA(2,2)
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Appendix  B 


Fitting  the  Simulated  Log-Linear  Learning  Curve  Data  Using  50  Observations
(Work  done  in  EXCEL)


Table  B-1  Detailed  Summary  of  Log-Linear  Fitting 


Log-Linear Model
Parameters:  a, b  (values illegible)
SSE:  763,686
Equation:  Y(x) = a*x^b

Forsythe
Parameters:  a* = 929.17375,  b* = -0.402443468,  cmin = 100
SSE:  (illegible)
Equation:  Y(x) = a*x^b + cmin

Stanford-B
Parameters:  (illegible)
SSE:  (illegible)
Equation:  Y(x) = a*(x+beta)^n

(Model name illegible)
Parameters:  alpha (illegible),  a = 0.977241524,  beta = 0.022758477
SSE:  (illegible)
Equation:  MC(x) = alpha*a^(x-1) + beta

S-Curve
Parameters:  a, L*, b*, k*  (values illegible)
SSE:  (illegible)
Equation:  Y(x) = L*exp(-b*exp(-k*t))

AR(1)
Parameters:  phi*, const*  (values illegible)
SSE:  9.686E+03
Equation:  AR(1) = phi*(AR1-1) + const + noise

AR(2)
Parameters:  phi_1* (illegible),  phi_2* (illegible),  const* = 56.68531
SSE:  (illegible)
Equation:  AR(2) = phi1*(AR2-1) + phi2*(AR2-2) + const + noise

AR(3)
Parameters:  phi_1*, phi_2*, phi_3*, const*  (values illegible)
SSE:  (illegible)
Equation:  AR(3) = phi1*(AR3-1) + phi2*(AR3-2) + phi3*(AR3-3) + const + noise

AR(4)
Parameters:  phi_1*, phi_2*, phi_3*, phi_4*, const*  (values illegible)
SSE:  1.439E+03
Equation:  AR(4) = phi1*(AR4-1) + phi2*(AR4-2) + phi3*(AR4-3) + phi4*(AR4-4) + const + noise

MA(1)
Parameters:  myou* = 455.64949,  theta* = 0.471978149
SSE:  (illegible)
Equation:  MA(1) = myou - theta*errorminus1 + noise

ARMA(1,1)
Parameters:  muprime* (illegible),  theta* (illegible),  phi* = 0.74083
SSE:  (illegible)
Equation:  ARMA(1,1) = phi*(ARMA(1,1)-1) + muprime - theta*errorlast + noise

ARMA(1,2)
Parameters:  muprime*, theta1*, theta2*  (values illegible)
SSE:  (illegible)
Equation:  ARMA(1,2) = phi1*(ARMA(1,2)-1) + muprime - theta1*errorlast - theta2*errorlastlast + noise

ARMA(2,1)
Parameters:  muprime*, phi1*, phi2*, theta*  (values illegible)
SSE:  (illegible)
Equation:  ARMA(2,1) = phi1*(ARMA(2,1)-1) + phi2*(ARMA(2,1)-2) + muprime - theta*errorlast + noise

ARMA(2,2)
Parameters:  muprime* (illegible),  phi1* (illegible),  phi2* = 0.07045,  theta1* (illegible),  theta2* (illegible)
SSE:  (illegible)

Equation: 

ARMA(2,2)=phil*(ARMA(2,2)-l)+phi2*(ARMA(2,2)-2)+muprime- 
thetal*errorlast-  theta2*errorlas 
Table B-2 Brief Summary of Log-Linear Fitting


Model        SSE          Rank
Log-Linear   [illegible]  1
Forsythe     11576.1      [illegible]
Pegel        242678.9     [illegible]
S-Curve      43241.7      [illegible]
MA(1)        2665105.0    14
AR(1)        9685.7       10
AR(2)        4851.1       7
AR(3)        1756.4       3
AR(4)        1438.8       2
ARMA(1,1)    5856.5       8
ARMA(1,2)    6274.5       9
ARMA(2,1)    3714.9       5
ARMA(2,2)    4468.4       6
Parameters: a, b* = [illegible]
SSE: 100,000
Equation: Y(x)=a*x^b

Figure B-1 EXCEL, Log-Linear Fit to Log-Linear Data


Parameters: a* = 929.1737539, b* = -0.402443468, cmin = 100
SSE: [illegible]
Equation: Y(x)=a*x^b+cmin

Figure B-2 EXCEL, Forsythe Fit to Log-Linear Data


Figure B-5 EXCEL, S-Curve Fit to Log-Linear Data


[Chart: AR(3) Fit to Log-Linear Data]


Parameters: phi_1*, phi_2*, phi_3*, phi_4*, const*
Values as printed (one value missing): 0.037070476  0.046871928  0.01883124  42.3711
SSE: 1.439E+03
Equation: AR(4)=phi1*(AR4-1)+phi2*(AR4-2)+phi3*(AR4-3)+phi4*(AR4-4)+const+noise

Figure B-9 EXCEL, AR(4) Fit to Log-Linear Data


Parameters: myou* = [illegible], theta* = 0.4720
SSE: [illegible]
Equation: MA(1)=myou-theta*errorminus1+noise

Figure B-10 EXCEL, MA(1) Fit to Log-Linear Data


[Chart: ARMA(1,2) Fit to Log-Linear Data]


Parameters: muprime* = 63.7918, phi1* = 0.7645, phi2* = 0.041791709, theta* = [illegible]
SSE: 3.715E+03
Equation: ARMA(2,1)=phi1*(ARMA(2,1)-1)+phi2*(ARMA(2,1)-2)+muprime-theta*errorlast+noise

Figure B-13 EXCEL, ARMA(2,1) Fit to Log-Linear Data

[Chart: ARMA(2,1) Fit to Log-Linear Data; x-axis: Unit Number]


Parameters: muprime* = [illegible], phi1* = [illegible], phi2* = [illegible], theta1* = [illegible], theta2* = -0.319203866
SSE: 4.468E+03
Equation: ARMA(2,2)=phi1*(ARMA(2,2)-1)+phi2*(ARMA(2,2)-2)+muprime-theta1*errorlast-theta2*errorlastlast+noise

Figure B-14 EXCEL, ARMA(2,2) Fit to Log-Linear Data

[Chart: ARMA(2,2) Fit to Log-Linear Data]
Table B-3 EXCEL, Simulated Log-Linear Data Base


The Simulated Log-Linear Data Base

Unit Number  Log-Lin(stoch)   Unit Number  Log-Lin(stoch)
1            1010.832269      26           347.6588376
2            808.3325458      27           346.8230994
3            691.2839896      28           340.5004826
4            641.1901161      29           336.3884482
5            595.4460444      30           335.0655174
6            557.2059351      31           329.6040636
7            527.0011364      32           332.1842007
8            508.4621019      33           325.4725805
9            496.9922034      34           320.8691874
10           469.848955       35           316.8387424
11           462.2715407      36           315.0144544
12           454.8076222      37           316.1912111
13           442.1848743      38           311.1881426
14           432.8762252      39           311.0490991
15           414.1127291      40           304.7384949
16           403.5124097      41           304.5376237
17           396.6960025      42           300.691177
18           391.530033       43           297.5358953
19           383.1869894      44           296.2756757
20           379.2670047      45           292.0449126
21           377.8492397      46           289.0213147
22           372.9927929      47           286.4419215
23           366.6170179      48           285.358444
24           364.765332       49           284.2516737
25           352.3002239      50           287.7423919
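
As an illustration of how a log-linear curve Y(x)=a*x^b can be fitted to data like Table B-3, the sketch below runs ordinary least squares on ln(Y) against ln(x) for the first ten units. This is a swapped-in technique, not the EXCEL procedure used in this appendix: it minimizes squared error in log space, so its parameters will not exactly match those in Table B-1. The function names are hypothetical.

```python
import math

# First ten (unit number, value) pairs from Table B-3.
data = [
    (1, 1010.832269), (2, 808.3325458), (3, 691.2839896), (4, 641.1901161),
    (5, 595.4460444), (6, 557.2059351), (7, 527.0011364), (8, 508.4621019),
    (9, 496.9922034), (10, 469.848955),
]


def fit_log_linear(points):
    """OLS of ln(y) on ln(x); returns (a, b) for Y(x) = a * x**b."""
    log_x = [math.log(x) for x, _ in points]
    log_y = [math.log(y) for _, y in points]
    n = len(points)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept mapped back to original units
    return a, b


def sse(points, a, b):
    """Sum of squared errors in original (not log) units."""
    return sum((y - a * x ** b) ** 2 for x, y in points)


a, b = fit_log_linear(data)
fit_sse = sse(data, a, b)
print(f"a = {a:.2f}, b = {b:.4f}, SSE over 10 units = {fit_sse:.1f}")
```

The slope b is negative because unit cost falls as the unit number rises; the corresponding learning-curve slope is 2**b.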
Appendix C


Fitting  the  Historical  F-102  Data  Using  500  Observations 
(Work  done  in  EXCEL) 


Table C-1 Detailed Summary of Fitting the Historical F-102 Data Base


Fitting the F-102 Data

Forsythe
Parameters: a* = 838249.88, b* = -0.5144, cmin* = [illegible]
SSE: 5.503E+10
Equation: Y(x)=a*x^b+cmin

Stanford-B
Parameters: a* = 45456836.6, beta* = 29.5607, n* = -1.4187
SSE: 4.540E+10
Equation: Y(x)=a*(x+beta)^n

AR(1)
Parameters: phi* = 0.91522, const* = 4629.393
SSE: 2.722E+10
Equation: AR(1)=phi*(AR1-1)+const+noise

AR(2)
Parameters: phi_1* = 0.9017230, phi_2* = -0.0214, const* = 11982.597
SSE: 4.930E+09
Equation: AR(2)=phi1*(AR2-1)+phi2*(AR2-2)+const+noise

AR(3)
Parameters: phi_1* = 0.84346, phi_2* = 0.08390, phi_3* = -0.00332, const* = 4094.4724
SSE: 2.637E+10
Equation: AR(3)=phi1*(AR3-1)+phi2*(AR3-2)+phi3*(AR3-3)+const+noise

AR(4)
Parameters: phi_1* = 0.89845, phi_2* = -0.00194, phi_3* = -0.02330, phi_4* = -0.28369, const* = 12934.3652
SSE: 4.877E+09
Equation: AR(4)=phi1*(AR4-1)+phi2*(AR4-2)+phi3*(AR4-3)+phi4*(AR4-4)+const+noise

ARMA(1,1)
Parameters: muprime* = 3426.4751, phi* = 0.93489, theta* = 0.52344
SSE: 2.089E+10
Equation: ARMA(1,1)=phi*(ARMA(1,1)-1)+muprime-theta*errorlast+noise

ARMA(1,2)
Parameters: muprime* = 3468.5690, phi1* = 0.93417, theta1* = 0.56678, theta2* = -0.06678
SSE: 2.079E+10
Equation: ARMA(1,2)=phi1*(ARMA(1,2)-1)+muprime-theta1*errorlast-theta2*errorlastlast+noise

ARMA(2,1)
Parameters: muprime* = 3400.5396, phi1* = 0.92509, phi2* = 0.01020, theta* = 0.52183
SSE: 2.087E+10
Equation: ARMA(2,1)=phi1*(ARMA(2,1)-1)+phi2*(ARMA(2,1)-2)+muprime-theta*errorlast+noise

ARMA(2,2)
Parameters: muprime* = 3477.1898, phi1* = 0.93663, phi2* = -0.00260, theta1* = 0.56949, theta2* = -0.07003
SSE: 2.079E+10
Equation: ARMA(2,2)=phi1*(ARMA(2,2)-1)+phi2*(ARMA(2,2)-2)+muprime-theta1*errorlast-theta2*errorlastlast+noise
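
To make the AR(1) notation in Table C-1 concrete, the sketch below reads "AR(1)=phi*(AR1-1)+const+noise" as Y(t) = phi*Y(t-1) + const + noise and computes one-step-ahead fitted values for the first ten TOTHRS observations of Table E-1, using the AR(1) parameters printed in Table C-1. Both that reading of the equation and the use of actual previous values (rather than previous fitted values) are assumptions made for illustration; the appendix does not state how the spreadsheet computed its SSE.

```python
# TOTHRS for observations 1 to 10 of Table E-1.
tothrs = [402475, 375849, 278963, 271223, 262498,
          258078, 243726, 232766, 220833, 218827]

PHI = 0.91522     # phi* for AR(1), Table C-1
CONST = 4629.393  # const* for AR(1), Table C-1


def ar1_one_step(series, phi, const):
    """Fitted values for observations 2..n, each built from the actual previous value."""
    return [phi * previous + const for previous in series[:-1]]


fitted = ar1_one_step(tothrs, PHI, CONST)
residuals = [actual - predicted for actual, predicted in zip(tothrs[1:], fitted)]
partial_sse = sum(r * r for r in residuals)

for unit, (predicted, r) in enumerate(zip(fitted, residuals), start=2):
    print(f"unit {unit}: fitted = {predicted:.1f}, residual = {r:.1f}")
print(f"SSE over units 2 to 10 = {partial_sse:.3e}")
```

The partial SSE printed here covers only nine residuals, so it is not comparable to the 500-observation SSE in the table.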

Table C-2 Brief Summary of Fitting the Historical F-102 Data Base


Model        SSE         Rank
Log-Linear   5.503E+10   10
Forsythe     5.503E+10   10
Stanford-B   4.540E+10   9
AR(1)        2.722E+10   8
AR(2)        4.930E+09   2
AR(3)        2.637E+10   7
AR(4)        4.877E+09   1
ARMA(1,1)    2.089E+10   6
ARMA(1,2)    2.079E+10   4
ARMA(2,1)    2.087E+10   5
ARMA(2,2)    2.079E+10   3


Parameters: a*, b*, cmin* = [illegible]
SSE: 5.503E+10
Equation: Y(x)=a*x^b+cmin

Figure C-2 EXCEL, Forsythe Fit to F-102 Historical Data

[Chart: Forsythe/F-102 Fit; y-axis: Total Hours]


Parameters: a* = [illegible], beta* = 29.56069417, n* = [illegible]
SSE: 4.540E+10
Equation: Y(x)=a*(x+beta)^n

Figure C-3 EXCEL, Stanford-B Fit to F-102 Historical Data

[Chart: Stanford-B/F-102 Fit]


Parameters: phi* = 0.915219997, const* = 4629.392506
SSE: [illegible]
Equation: AR1(x)=phi*(Ysubx-1)+const+error

Figure C-4 EXCEL, AR(1) Fit to F-102 Historical Data

[Chart: AR(1)/F-102 Fit; x-axis: Unit Number; y-axis: Total Hours]


Parameters: phi_1* = [illegible], phi_2* = -0.00905, const* = [illegible]
SSE: [illegible]
Equation: AR2 = phi_1*(Ysubx-1)+phi_2*(Ysubx-2)+const+error

Figure C-5 EXCEL, AR(2) Fit to F-102 Historical Data

[Chart: x-axis: Unit Number]


Parameters: phi_1* = [illegible], phi_2* = [illegible], phi_3* = [illegible], const* = 3096.15527
SSE: 4.035E+10
Equation: AR3 = phi_1*(Ysubx-1)+phi_2*(Ysubx-2)+phi_3*(Ysubx-3)+const+error

Figure C-6 EXCEL, AR(3) Fit to F-102 Historical Data

[Chart: AR(3)/F-102 Fit; y-axis: Total Hours]


Parameters: phi_1*, phi_2*, phi_3*, phi_4* = [illegible], const* = 9123.75263
SSE: 2.349E+10
Equation: AR4 = phi_1*(Ysubx-1)+phi_2*(Ysubx-2)+phi_3*(Ysubx-3)+phi_4*(Ysubx-4)+const+error

Figure C-7 EXCEL, AR(4) Fit to F-102 Historical Data


Parameters: muprime* = [illegible], phi1* = 0.9342, theta1* = 0.5668, theta2* = -0.06678403
SSE: 2.079E+10
Equation: ARMA(1,2)=phi1*(ARMA(1,2)-1)+muprime-theta1*errorlast-theta2*errorlastlast+noise

Figure C-9 EXCEL, ARMA(1,2) Fit to F-102 Historical Data

[Chart: ARMA(1,2)/F-102 Fit; x-axis: PLN Number]


Parameters: muprime* = [illegible], phi1* = [illegible], phi2* = [illegible], theta* = 0.5218
SSE: 2.087E+10
Equation: ARMA(2,1)=phi1*(ARMA(2,1)-1)+phi2*(ARMA(2,1)-2)+muprime-theta*errorlast+noise

Figure C-10 EXCEL, ARMA(2,1) Fit to F-102 Historical Data


Parameters: muprime*, phi1*, phi2*, theta1*, theta2* = [illegible]
SSE: [illegible]
Equation: ARMA(2,2)=phi1*(ARMA(2,2)-1)+phi2*(ARMA(2,2)-2)+muprime-theta1*errorlast-theta2*errorlastlast+noise

Figure C-11 EXCEL, ARMA(2,2) Fit to F-102 Historical Data

[Chart: ARMA(2,2)/F-102 Fit; series: F-102, ARMA(2,2)]

Appendix  D 


Forecasting  the  Historical  F-102  Data  Using  20  Observations  and 
Hold-out  Sample  of  480  Observations 
(Work  done  in  EXCEL) 


Table D-1 Detailed Summary of Forecasting the Historical F-102 Data Base


Forsythe
Parameters: a* = 3772289.66, b* = -1.0169, cmin* = [illegible]
SSE (1st 20): 2.965E+09
SSE (all 500): 6.204E+11
Equation: Y(x)=a*x^b+cmin

Stanford-B
Parameters: a* = 1753527.5, beta* = [illegible], n* = [illegible]
SSE (1st 20): 2.814E+09
SSE (all 500): 1.560E+10
Equation: Y(x)=a*(x+beta)^n

AR(1)
Parameters: phi* = 0.88739, const* = 11333.898
SSE (1st 20): 2.452E+09
SSE (all 500): 1.075E+12
Equation: AR(1)=phi*(AR1-1)+const+noise

AR(2)
Parameters: phi_1* = 0.9032423, phi_2* = -0.0341, const* = 14658.230
SSE (1st 20): 2.350E+09
SSE (all 500): 9.413E+09
Equation: AR(2)=phi1*(AR2-1)+phi2*(AR2-2)+const+noise

AR(3)
Parameters: phi_1* = 0.89068, phi_2* = -0.00943, phi_3* = -0.03295, const* = 18459.07235
SSE (1st 20): 2.257E+09
SSE (all 500): 2.160E+12
Equation: AR(3)=phi1*(AR3-1)+phi2*(AR3-2)+phi3*(AR3-3)+const+noise

AR(4)
Parameters: phi_1* = 0.85381, phi_2* = -0.01134, phi_3* = 0.02524, phi_4* = -0.08194, const* = 29783.44523
SSE (1st 20): 1.707E+09
SSE (all 500): 4.063E+10
Equation: AR(4)=phi1*(AR4-1)+phi2*(AR4-2)+phi3*(AR4-3)+phi4*(AR4-4)+const+noise

ARMA(1,1)
Parameters: muprime* = 11141.4558, theta* = 0.17746, phi* = 0.88825
SSE (1st 20): 2.414E+09
SSE (all 500): 1.036E+12
Equation: ARMA(1,1)=phi*(ARMA(1,1)-1)+muprime-theta*errorlast+noise

ARMA(1,2)
Parameters: muprime* = 10202.6021, phi1* = 0.90340, theta1* = 1.04307, theta2* = 0.67739
SSE (1st 20): 1.153E+09
SSE (all 500): 1.294E+12
Equation: ARMA(1,2)=phi1*(ARMA(1,2)-1)+muprime-theta1*errorlast-theta2*errorlastlast+noise

ARMA(2,1)
Parameters: muprime* = 26833.4934, phi1* = 0.91517, phi2* = -0.10895, theta* = 1.72503
SSE (1st 20): 8.848E+08
SSE (all 500): 3.320E+12
Equation: ARMA(2,1)=phi1*(ARMA(2,1)-1)+phi2*(ARMA(2,1)-2)+muprime-theta*errorlast+noise

ARMA(2,2)
Parameters: muprime* = 15389.0325, phi1* = 0.93634, phi2* = -0.07749, theta1* = 1.29884, theta2* = 0.46942
SSE (1st 20): 8.793E+08
SSE (all 500): 1.458E+12
Equation: ARMA(2,2)=phi1*(ARMA(2,2)-1)+phi2*(ARMA(2,2)-2)+muprime-theta1*errorlast-theta2*errorlastlast+noise

Table D-2 Brief Summary of Forecasting the Historical F-102 Data Base


Model        SSE (1st 20)  SSE (last 480)  SSE (all 500)  Rank (1st 20)  Rank (last 480)
Log-Linear   2.965E+09     6.175E+11       6.204E+11      10             4
Forsythe     2.965E+09     6.175E+11       6.204E+11      10             4
Stanford-B   2.814E+09     1.278E+10       1.560E+10      9              2
AR(1)        2.452E+09     1.073E+12       1.075E+12      8              6
AR(2)        2.350E+09     7.064E+09       9.413E+09      6              1
AR(3)        2.257E+09     2.158E+12       2.160E+12      5              9
AR(4)        1.707E+09     3.892E+10       4.063E+10      4              3
ARMA(1,1)    2.414E+09     1.034E+12       1.036E+12      7              5
ARMA(1,2)    1.153E+09     1.293E+12       1.294E+12      3              7
ARMA(2,1)    8.848E+08     3.319E+12       3.320E+12      2              10
ARMA(2,2)    8.793E+08     1.457E+12       1.458E+12      1              8
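
The columns of Table D-2 are linked by simple arithmetic: the SSE over all 500 units should equal the SSE over the 20-unit fitting history plus the SSE over the 480-unit hold-out sample, and each rank column follows from sorting its SSE column. The sketch below checks both against the printed values. The dense-ranking rule (tied models share a rank and the next rank is not skipped) is inferred from the table; the last row's label, ARMA(2,2), is taken from the matching values in Table D-1. The function and variable names are illustrative only.

```python
# (model, SSE 1st 20, SSE last 480, SSE all 500) as printed in Table D-2.
rows = [
    ("Log-Linear", 2.965e9, 6.175e11, 6.204e11),
    ("Forsythe",   2.965e9, 6.175e11, 6.204e11),
    ("Stanford-B", 2.814e9, 1.278e10, 1.560e10),
    ("AR(1)",      2.452e9, 1.073e12, 1.075e12),
    ("AR(2)",      2.350e9, 7.064e9,  9.413e9),
    ("AR(3)",      2.257e9, 2.158e12, 2.160e12),
    ("AR(4)",      1.707e9, 3.892e10, 4.063e10),
    ("ARMA(1,1)",  2.414e9, 1.034e12, 1.036e12),
    ("ARMA(1,2)",  1.153e9, 1.293e12, 1.294e12),
    ("ARMA(2,1)",  8.848e8, 3.319e12, 3.320e12),
    ("ARMA(2,2)",  8.793e8, 1.457e12, 1.458e12),  # label inferred from Table D-1
]


def dense_rank(values):
    """Rank 1 = smallest value; ties share a rank and no rank is skipped."""
    ordered = sorted(set(values))
    return [ordered.index(v) + 1 for v in values]


fit_rank = dense_rank([fit for _, fit, _, _ in rows])
forecast_rank = dense_rank([holdout for _, _, holdout, _ in rows])

# Largest relative gap between (fit SSE + hold-out SSE) and the printed total.
max_gap = max(abs(fit + holdout - total) / total for _, fit, holdout, total in rows)

for (model, _, _, _), fr, hr in zip(rows, fit_rank, forecast_rank):
    print(f"{model:<11} rank (1st 20) = {fr:2d}   rank (last 480) = {hr:2d}")
print(f"largest relative gap in SSE totals = {max_gap:.1e}")
```

Laid side by side, the two rank columns show that the models fitting the first 20 units most closely (the ARMA models) are not the ones that forecast the remaining 480 units best.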


Parameters: a*, beta*, n* = [illegible]
SSE (1st 20): [illegible]
SSE (all 500): [illegible]
Equation: Y(x)=a*(x+beta)^n

Figure D-3 EXCEL, Stanford-B Forecast of F-102 Historical Data

[Chart: Stanford-B Forecast of F-102 (using 20 unit history); series: F-102 Data, Forecast (start at unit 31)]


Parameters: phi* = 0.887387686, const* = 11333.89809
SSE (1st 20): 2.452E+09
SSE (all 500): 1.075E+12
Equation: AR1 = phi*(Ysubx-1)+const+error

Figure D-4 EXCEL, AR(1) Forecast of F-102 Historical Data

[Chart: AR(1) Forecast of F-102 (using 20 unit history)]


Parameters: phi_1* = 0.903242252, phi_2* = [illegible], const* = 14658.230
SSE (1st 20): 2.350E+09
SSE (all 500): 9.413E+09
Equation: AR(2) = phi_1*(Ysubx-1)+phi_2*(Ysubx-2)+const+error

Figure D-5 EXCEL, AR(2) Forecast of F-102 Historical Data

[Chart: AR(2) Forecast of F-102 (using 20 unit history); series: F-102, Forecast (starts at unit 26)]


Parameters: phi_1*, phi_2*, phi_3*, phi_4*, const* = [illegible]
SSE (1st 20): 1.707E+09
SSE (all 500): 4.063E+10
Equation: AR(4) = phi_1*(Ysubx-1)+phi_2*(Ysubx-2)+phi_3*(Ysubx-3)+phi_4*(Ysubx-4)+const+error

Figure D-7 EXCEL, AR(4) Forecast of F-102 Historical Data

[Chart: AR(4) Forecast of F-102 (from 20 unit history)]


Parameters: muprime* = 11141.4558, theta* = 0.1775, phi* = 0.8883
SSE (1st 20): 2.414E+09
SSE (all 500): 1.036E+12
Equation: ARMA(1,1) = phi*(ARMA(1,1)-1)+muprime-theta*errorlast+noise

Figure D-8 EXCEL, ARMA(1,1) Forecast of F-102 Historical Data

[Chart: ARMA(1,1) Forecast of F-102 (using 20 unit history); y-axis: Hours]


Parameters: muprime*, phi1*, theta1*, theta2* = [illegible]
SSE (1st 20): [illegible]
SSE (all 500): [illegible]
Equation: ARMA(1,2) = phi1*(ARMA(1,2)-1)+muprime-theta1*errorlast-theta2*errorlastlast+noise

Figure D-9 EXCEL, ARMA(1,2) Forecast of F-102 Historical Data

[Chart: ARMA(1,2) Forecast of F-102 (from 20 unit history)]


Parameters: muprime* = 26833.4934, phi1* = 0.9152, phi2* = [illegible], theta* = [illegible]
SSE (1st 20): [illegible]
SSE (all 500): 3.320E+12
Equation: ARMA(2,1) = phi1*(ARMA(2,1)-1)+phi2*(ARMA(2,1)-2)+muprime-theta*errorlast+noise

Figure D-10 EXCEL, ARMA(2,1) Forecast of F-102 Historical Data

[Chart: ARMA(2,1) Forecast of F-102 (from 20 unit history)]


Parameters: muprime* = 15389.0325, phi1* = 0.9363, phi2* = -0.0773, theta1* = 1.298842272, theta2* = 0.469419732
SSE (1st 20): 8.793E+08
SSE (all 500): [illegible]
Equation: ARMA(2,2) = phi1*(ARMA(2,2)-1)+phi2*(ARMA(2,2)-2)+muprime-theta1*errorlast-theta2*errorlastlast+noise

Figure D-11 EXCEL, ARMA(2,2) Forecast of F-102 Historical Data

[Chart: ARMA(2,2) Forecast of F-102 (from 20 unit history)]

Appendix  E 


The  Historical  F-102  Data  Base  -  500  Observations 


Table  E-1  The  Historical  F-102  Data 


F-102 Data Base

OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
1  1  1  402475  1  5942  1
2  2  2  375849  1  5942  3
3  3  3  278963  2  5942  7
4  4  4  271223  2  5942  7
5  5  5  262498  2  5942  8
6  6  6  258078  2  5942  9
7  7  7  243726  2  5942  10
8  8  8  232766  2  5942  10
9  9  9  220833  2  5942  11
10  10  10  218827  2  5942  12
11  11  11  322447  3  5942  15
12  12  12  306736  3  5942  16
13  13  13  290470  3  5942  17
14  14  14  282951  3  5942  18
15  15  15  233125  4  5942  21
16  16  16  215379  4  5942  22
17  17  17  203122  4  5942  22
18  18  21  189770  4  5942  24
19  19  18  164120  4  5942  23
20  20  19  169080  4  5942  23
21  22  20  150387  4  5942  23
22  23  22  154606  4  5942  24
23  24  23  168896  4  5942  27
24  25  27  159485  4  5942  27
25  26  25  149109  4  5942  27
26  27  32  129792  5  5942  28
27  28  24  128958  5  5942  27
28  29  31  130389  5  5942  28
29  31  26  114872  4  23903  26
30  32  36  128470  5  5942  28
31  33  35  121932  5  5942  28
32  34  29  117969  5  5942  28
33  35  33  117364  5  5942  28
34  36  37  116895  5  5942  29
35  37  30  94174  5  23903  27
36  38  40  115466  5  5942  29
37  39  28  111137  5  5942  27
38  40  47  108890  5  5942  30
39  41  49  113487  5  5942  30
40  43  55  115846  5  5942  30
41  44  39  111029  5  5942  29
42  45  44  107397  5  5942  29
43  21  38  164751  4  5942  29
44  30  46  119597  5  5942  29
45  47  52  100520  6  23903  31
46  48  50  100504  6  23903  31


OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
47  49  34  98316  6  23903  30
48  51  57  99317  6  23903  32
49  52  41  98162  6  23903  30
50  53  42  94614  6  23903  30
51  55  43  91628  6  23903  30
52  56  45  88098  6  23903  30
53  58  48  90164  6  23903  30
54  60  51  86840  6  23903  31
55  61  53  87197  6  23903  31
56  63  54  93812  6  23903  32
57  65  56  97725  6  23903  32
58  66  59  92401  7  23903  33
59  67  58  86481  7  23903  33
60  69  60  86830  7  23903  33
61  70  61  84477  7  23903  33
62  72  70  88589  7  23903  34
63  62  75  73200  9  23903  34
64  73  62  81738  7  23903  33
65  74  109  90762  7  23903  35
66  75  64  81865  7  23903  34
67  77  65  79839  7  23903  34
68  78  66  79358  7  23903  34
69  79  67  79651  7  23903  34
70  80  68  77606  7  23903  34
71  82  69  75097  7  23903  34
72  46  86  103137  6  23903  35
73  64  81  77297  9  23903  34
74  83  71  75510  7  23903  34
75  84  145  83796  7  23903  35
76  85  72  82662  7  23903  35
77  86  73  85787  7  23903  35
78  88  146  87891  7  23903  35
79  89  74  78264  7  23903  35
80  90  76  76967  7  23903  35
81  91  77  74963  7  23903  35
82  92  78  74628  7  23903  35
83  93  79  76608  7  23903  35
84  94  80  73689  7  23903  35
85  96  82  79580  8  29264  35
86  97  84  77425  8  29264  36
87  42  63  137990  6  23903  33
88  98  83  77409  8  29264  35
89  99  88  75854  8  29264  36
90  100  85  78187  8  29264  36
91  101  87  76044  8  29264  36
92  102  89  75017  8  29264  36


OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
93  [illegible]  91  73337  8  29264  36
94  104  94  72091  8  29264  36
95  105  92  72916  8  29264  36
96  107  93  69718  8  29264  36
97  108  106  71085  8  29264  37
98  109  95  68818  8  29264  36
99  110  96  72137  8  29264  37
100  111  97  72101  8  29264  37
101  112  99  70917  8  29264  37
102  113  98  71333  8  29264  37
103  115  100  72966  8  29264  37
104  116  101  69869  8  29264  37
105  117  103  69895  8  29264  37
106  118  104  70448  8  29264  37
107  119  105  71490  8  29264  37
108  120  108  69560  8  29264  37
109  132  120  65290  9  29264  37
110  133  121  64415  9  29264  37
111  134  122  64269  9  29264  37
112  135  123  65396  9  29264  37
113  136  124  65398  9  29264  37
114  137  123  65747  9  29264  37
115  138  127  63262  9  29264  38
116  140  126  65332  9  29264  38
117  141  128  61274  9  29264  38
118  142  129  64319  9  29264  38
119  143  130  64793  9  29264  38
120  144  131  61270  9  29264  38
121  145  134  63046  9  29264  38
122  146  132  65928  9  29264  38
123  147  135  60844  9  29264  39
124  148  137  64753  9  29264  38
125  59  133  77679  9  23903  38
126  68  102  86933  9  23903  37
127  71  90  78669  9  23903  36
128  121  111  72530  9  29264  37
129  122  112  70642  9  29264  37
130  123  110  69385  9  29264  37
131  124  113  64465  9  29264  37
132  125  114  66052  9  29264  37
133  126  115  66023  9  29264  37
134  128  116  65284  9  29264  37
135  129  117  64232  9  29264  37
136  130  118  64695  9  29264  37
137  131  119  65503  9  29264  37
138  150  136  63054  9  29264  38
OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
139  151  138  63048  9  29264  38
140  152  139  64929  9  29264  38
141  153  140  62522  9  29264  38
142  154  142  60538  9  29264  38
143  155  141  62861  9  29264  38
144  156  147  61370  9  29264  38
145  157  148  60747  9  29264  39
146  158  143  60665  9  29264  38
147  159  149  62824  9  29264  38
148  161  150  58087  9  29264  38
149  162  152  59473  9  29264  38
150  163  157  63951  9  29264  38
151  164  153  59320  9  29264  38
152  165  156  60055  9  29264  38
153  166  154  61220  9  29264  38
154  167  155  62458  9  29264  38
155  168  157  62412  9  29264  38
156  170  158  61843  9  29264  38
157  171  159  63077  9  29264  38
158  172  160  62071  9  29264  38
159  173  161  61858  10  29264  38
160  174  162  60979  10  29264  38
161  175  163  58349  10  29264  38
162  176  164  60204  10  29264  38
163  177  167  56691  10  29264  38
164  178  165  60527  10  29264  38
165  180  166  57210  10  29264  38
166  181  168  59645  10  29264  38
167  182  170  59433  10  29264  38
168  50  171  102354  6  23903  38
169  54  107  94057  8  23903  37
170  57  144  84495  8  23903  38
171  183  169  57653  10  29264  39
172  184  172  57566  10  29264  39
173  185  173  56691  10  29264  39
174  186  174  56621  10  29264  39
175  187  176  59791  10  29266  39
176  188  177  56079  10  29264  39
177  76  181  58506  10  23903  39
178  190  178  59994  10  29264  39
179  191  179  60757  10  29264  39
180  192  180  54645  10  29264  39
181  193  182  59312  10  29264  39
182  194  183  59483  10  29264  39
183  195  184  55581  10  29264  39
184  196  185  57455  10  29264  39


OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
185  198  175  54241  10  29264  39
186  199  186  59042  10  29264  39
187  200  187  56001  10  29264  39
188  201  188  53381  10  29264  39
189  202  189  57139  10  29264  39
190  203  190  55403  10  29264  39
191  201  191  54271  10  29264  39
192  81  192  54031  10  23903  39
193  206  193  57332  10  29264  39
194  207  194  53560  10  29264  39
195  208  193  56839  10  29264  39
196  209  196  55439  10  29264  39
197  210  197  53354  10  29264  39
198  211  198  57169  10  29264  39
199  212  199  56625  10  29264  39
200  214  200  53698  10  29264  39
201  215  201  55117  10  29264  39
202  216  202  64410  11  31174  40
203  217  203  63457  11  31174  40
204  218  204  64855  11  31174  40
205  219  203  64709  11  31174  40
206  220  206  61493  11  31174  40
207  87  207  61633  11  23903  40
208  222  208  64154  11  31174  40
209  223  209  60020  11  31174  40
210  224  210  61057  11  31174  40
211  226  211  64077  11  31174  40
212  227  212  62750  11  31174  40
213  228  213  63673  11  31174  40
214  229  214  61986  11  31174  40
215  231  215  65932  11  31174  40
216  232  216  64059  11  31174  40
217  233  217  61146  11  31174  40
218  234  218  63329  11  31174  40
219  236  219  54223  12  31174  40
220  237  220  57406  12  31174  40
221  95  221  52714  11  23903  40
222  238  222  57999  12  23903  40
223  239  223  58051  12  31174  40
224  241  224  47604  12  31174  40
225  242  225  55653  12  31174  40
226  243  226  53132  12  31174  40
227  244  227  55019  12  31174  40
228  246  228  51793  12  31174  40
229  247  229  54301  12  31174  40
230  248  230  51429  12  31174  40


OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
231  249  231  59606  12  31174  46
232  251  232  51161  12  31174  40
233  252  233  53072  12  31174  40
234  253  234  56653  12  31174  40
235  106  235  53057  11  23903  41
236  254  236  55143  12  31171  40
237  256  237  50899  12  31174  40
238  257  238  52395  12  31174  40
239  258  239  50183  12  31174  40
240  259  240  52217  12  31174  40
241  261  241  52656  12  31174  40
242  262  245  52818  12  31174  41
243  263  242  49485  12  31174  40
244  265  246  52517  12  31174  41
245  266  251  49671  12  31174  41
246  267  247  54124  12  31174  41
247  269  248  50204  12  31174  41
248  270  243  52963  12  31174  41
249  271  249  51224  12  31174  41
250  114  244  50005  11  23903  41
251  273  250  52793  12  31174  41
252  274  252  49813  12  31174  41
253  275  253  51312  12  31174  41
254  277  254  49304  12  31174  41
255  278  257  50879  12  31174  41
256  279  255  48798  12  31174  41
257  281  256  51624  12  31174  41
258  282  258  47738  12  31174  41
259  283  259  52435  12  31174  41
260  285  262  48038  12  31174  41
261  127  263  50935  12  23903  42
262  286  260  49978  12  31174  41
263  287  261  49255  12  31174  41
264  289  264  49755  12  31174  41
265  290  271  46508  12  31174  41
266  291  277  49555  12  31174  41
267  293  265  47469  12  31174  41
268  294  272  50337  12  31174  41
269  295  266  48865  12  31174  41
270  139  273  51644  12  23903  42
271  297  267  49463  12  31174  41
272  298  268  47582  12  31174  41
273  299  269  49975  12  31774  41
274  301  274  46285  12  31174  41
275  302  275  50195  12  31174  41
276  303  283  47662  12  31174  42
OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
323  346  321  46362  13  31174  43
324  347  378  49038  13  31174  43
325  348  325  48512  13  31174  43
326  349  322  48172  13  31174  43
327  350  323  45123  13  31174  43
328  197  365  44669  14  29264  44
329  351  326  46952  13  31174  43
330  352  327  45029  13  31174  43
331  353  331  47745  13  31174  43
332  354  328  43882  13  31174  43
333  355  329  46630  13  31174  43
334  356  330  46004  13  31174  43
335  357  332  48265  13  31174  43
336  358  333  45810  13  31174  43
337  359  336  47808  13  31174  43
338  360  348  45384  13  31174  43
339  205  384  41615  14  29264  45
340  361  334  47072  13  31174  3
341  362  337  46680  13  31174  43
342  363  335  45825  13  31174  43
343  364  338  46737  13  31174  43
344  365  349  44783  13  31174  43
345  366  339  46061  13  31174  43
346  367  340  46326  13  37174  43
347  368  347  44932  13  31174  43
348  369  341  46868  13  31174  43
349  370  342  45799  13  31774  43
350  213  385  40989  14  29264  45
351  371  343  45658  13  31174  43
352  372  353  44921  13  31174  43
353  373  350  45909  13  31174  43
354  374  344  45890  13  31174  43
355  375  345  45209  13  31174  43
356  376  346  46284  13  31174  43
357  377  351  44480  13  31174  43
358  378  354  46528  13  31174  43
359  379  355  45290  13  31174  43
360  380  352  46779  13  31174  43
361  220  366  43022  14  29264  44
362  381  356  45011  13  31174  43
363  382  357  46331  13  31174  43
364  383  358  45536  13  31174  43
365  384  397  45241  13  31174  44
366  385  359  44991  13  31174  43
367  386  360  46435  13  31174  43
368  387  361  44128  13  31174  43
Table E-1 The Historical F-102 Data

F-102 Data Base

OBS  PLN  DelaySeq  TOTHRS  Lot  Contract#  DM
369  388  367  47348  13  31174  44
370  389  362  43402  13  31174  43
371  390  368  47047  13  31174  44
372  391  369  44173  13  31174  44
373  225  410  40271  14  29264  46
374  392  363  46565  13  31174  44
375  393  370  44759  13  31174  44
376  394  364  45930  13  31174  44
377  395  378  46869  13  31174  44
378  396  373  46292  13  31174  44
379  397  380  44844  13  31174  44
380  398  371  44195  13  31174  44
381  399  372  45243  13  31174  44
382  400  386  45916  13  31174  44
383  401  375  44838  13  31174  44
384  402  381  44763  13  31174  44
385  230  421  41948  14  29264  45
386  403  374  44645  13  31174  44
387  404  376  43948  13  31174  44
388  405  389  43744  13  31174  44
389  406  382  44580  13  31174  44
390  407  391  46038  13  31174  44
391  408  387  43652  13  31174  44
392  409  379  42371  13  31174  44
393  410  383  45607  13  31174  44
394  411  388  44552  13  31174  44
395  412  392  47014  13  31174  44
396  413  393  44289  13  31174  44
397  414  394  45401  13  31174  44
398  235  428  39196  14  29264  46
399  415  390  43368  13  31174  44
400  416  403  46979  14  31174  44
401  417  398  44974  14  31174  44
402  418  404  46598  14  31174  44
403  419  395  43842  14  31174  44
404  420  399  46131  14  31174  44
405  421  396  43247  14  31174  44
406  422  400  44734  14  31174  44
407  423  401  43501  14  31174  44
408  424  402  46129  14  31174  44
409  425  405  42443  14  31174  44
410  240  454  34031  14  29264  46
411  426  417  44328  14  31174  45
412  427  406  44473  14  31174  44
413  428  407  43418  14  31174  44
414  429  408  43775  14  31174  44

415  430  409  42007  14  31174  44
416  431  411  42537  14  31174  44
417  432  422  43270  14  31174  45
418  433  418  42650  14  31174  45
419  434  412  43261  14  31174  45
420  435  413  42486  14  31174  45
421  436  414  42083  14  31174  45
422  437  419  42678  14  31174  45
423  245  440  34969  14  29264  46
424  438  425  41673  14  31174  45
425  439  415  41431  14  31174  45
426  440  420  41347  14  31174  45
427  441  416  43084  14  31174  45
428  442  429  42843  14  31174  45
429  443  462  44814  14  31174  45
430  444  441  41706  14  31174  45
431  445  423  41108  14  31174  45
432  446  426  41091  14  31174  45
433  447  424  42091  14  31174  45
434  448  430  40599  14  31174  45
435  250  470  33149  14  29264  46
436  449  437  43760  14  31174  45
437  450  427  42527  14  31174  45
438  451  438  41942  14  31174  45
439  452  431  41095  14  31174  45
440  453  432  42314  14  31174  43
441  454  466  40751  14  31174  43
442  455  434  40744  14  31174  45
443  456  439  39510  14  31174  45
444  457  433  41948  14  31174  45
445  458  442  39649  14  31174  45
446  459  435  41034  14  31174  45
447  460  436  38751  14  31174  45
448  255  475  32806  14  29264  46
449  461  443  41160  14  31174  45
450  462  444  39329  14  31174  45
451  463  445  40440  14  31174  45
452  464  446  38930  14  31174  45
453  465  447  39816  14  31174  45
454  466  455  40240  14  31174  45
455  467  448  39527  14  31174  45
456  468  449  38800  14  31174  45
457  469  450  40127  14  31174  45
458  470  456  38137  14  31174  45
459  471  451  40364  14  31174  45
460  472  559  35847  14  31174  48
461  265  499  34639  [Lot, Contract#, DM illegible]
462  473  452  36716  [Lot, Contract#, DM illegible]
463  474  457  38847  14  31174  45
464  475  458  37680  14  31174  45
465  476  459  39414  14  31174  45
466  477  453  36786  14  31174  45
467  478  476  41041  14  31174  46
468  479  471  38556  14  31174  46
469  480  460  38464  14  31174  45
470  481  477  38209  14  31174  46
471  482  463
Appendix F


Fit/Forecasting the Notional C-17 Data Using 15 Observations and a Hold-out Sample of 8 Observations
(Work done in EXCEL)


Table F-1 Detailed Summary of Fit/Forecasting the Notional C-17 Data

Log-Linear
Parameters: a* = 277.2899795, b* = -0.907621213
SSE (1st 15): 870.72
SSE (all 23): 1040.10
Equation: Y(x) = a*x^b

Forsythe
Parameters: a* = 1698.48, b* = -2.500637253, cmin* = 26.70881
SSE (1st 15): 291.11
SSE (all 23): 386.54
Equation: Y(x) = a*x^b + cmin

Stanford-B
Parameters: beta* = -3.1519 [other entries illegible; the value 68.2 survives without a label]
SSE (1st 15): 359.22
SSE (all 23): 364.74
Equation: Y(x) = a*(x+beta)^n

AR(1)
Parameters: 0.88464 [label illegible, presumably phi*]; const* [illegible]
SSE (1st 15): 1266.93
SSE (all 23): 1727.75
Equation: AR(1) = phi*(AR1-1) + const + error

AR(2)
Parameters: phi_1* = 0.9010377, phi_2* = -0.0341, const* = -0.253
SSE (1st 15): 1340.36
SSE (all 23): 1753.94
Equation: AR(2) = phi_1*(AR2-1) + phi_2*(AR2-2) + const + error

AR(3)
Parameters: phi_1* = 0.88898, phi_2* = -0.00943, phi_3* = -0.03295, const* = 0.481052734
SSE (1st 15): 1405.53
SSE (all 23): 1753.98
Equation: AR(3) = phi_1*(AR3-1) + phi_2*(AR3-2) + phi_3*(AR3-3) + const + noise

AR(4)
Parameters: phi_1* = 0.85284, phi_2* = -0.01134, phi_3* = 0.02524, phi_4* = -0.08194, const* = 2.673475955
SSE (1st 15): 1302.12
SSE (all 23): 1503.48
Equation: AR(4) = phi_1*(AR4-1) + phi_2*(AR4-2) + phi_3*(AR4-3) + phi_4*(AR4-4) + const + noise

ARMA(1,1)
Parameters: muprime* = 16.9352 [phi* and theta* illegible]
SSE (1st 15): 89.91
SSE (all 23): 398.09
Equation: ARMA(1,1) = phi*(ARMA(1,1)-1) + muprime - theta*errorlast + noise

ARMA(1,2)
Parameters: muprime* = 15.9419, phi1* = 0.89942, theta1* = -0.49149, theta2* = 0.13381
SSE (1st 15): 3698.99
SSE (all 23): 10270.02
Equation: ARMA(1,2) = phi1*(ARMA(1,2)-1) + muprime - theta1*errorlast - theta2*errorlastlast + noise

ARMA(2,1)
Parameters: muprime* = 21.6784 [phi1*, phi2* and theta* illegible in this table]
SSE (1st 15): 329.95
SSE (all 23): 572.27
Equation: ARMA(2,1) = phi1*(ARMA(2,1)-1) + phi2*(ARMA(2,1)-2) + muprime - theta*errorlast + noise

ARMA(2,2)
Parameters: muprime* = 20.4647, phi1* = 0.41473, phi2* = -0.07662, theta1* = -0.20164, theta2* = -0.13734
SSE (1st 15): 302.85
SSE (all 23): 524.07
Equation: ARMA(2,2) = phi1*(ARMA(2,2)-1) + phi2*(ARMA(2,2)-2) + muprime - theta1*errorlast - theta2*errorlastlast + noise
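
As a rough illustration of the three learning-curve equations listed in Table F-1, the following Python sketch evaluates them and computes a sum of squared errors. The function names are hypothetical and the original work was done in EXCEL, so this is only a minimal re-expression of the formulas, not the author's spreadsheet. The parameter values are the Log-Linear and Forsythe estimates reported in this appendix (cmin rounded to five decimals); no Stanford-B evaluation is attempted because its parameters are not fully legible.

```python
def log_linear(x, a, b):
    """Log-linear learning curve: Y(x) = a * x**b."""
    return a * x ** b


def forsythe(x, a, b, cmin):
    """Forsythe variant: a log-linear curve plus a floor, Y(x) = a * x**b + cmin."""
    return a * x ** b + cmin


def stanford_b(x, a, beta, n):
    """Stanford-B variant: Y(x) = a * (x + beta)**n."""
    return a * (x + beta) ** n


def sse(observed, predicted):
    """Sum of squared errors between two equal-length sequences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Parameter estimates as reported in this appendix.
LOG_LINEAR = {"a": 277.2899795, "b": -0.907621213}
FORSYTHE = {"a": 1698.476005, "b": -2.500637253, "cmin": 26.70881}

for unit in (5, 10, 23):
    print(unit,
          round(log_linear(unit, **LOG_LINEAR), 2),
          round(forsythe(unit, **FORSYTHE), 2))
```

Because b is negative in both fits, the Forsythe curve flattens toward cmin as the unit number grows, while the log-linear curve keeps declining toward zero.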


Table F-2 Brief Summary of Fit/Forecasting the Notional C-17 Data

Model       SSE (1st 15)  SSE (all 23)  SSE (last 8)  Rank (1st 15)  Rank (all 23)  Rank (last 8)
Log-Linear         870.7        1040.1         169.4              6              6              3
Forsythe           291.1         386.5          95.4              2              2              2
Stanford-B         359.2         364.7           5.5              5              1              1
AR(1)             1266.9        1727.7         460.8              7              8             10
AR(2)             1340.4        1753.9         413.6              9              9              9
AR(3)             1405.5        1754.0         348.4             10             10              8
AR(4)             1302.1        1503.5         201.4              8              7              4
ARMA(1,1)           89.9         398.1         308.2              1              3              7
ARMA(1,2)         3699.0       10270.0        6571.0             11             11             11
ARMA(2,1)          329.9         572.3         242.3              4              5              6
ARMA(2,2)          302.9         524.1         221.2              3              4              5
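
The relationship between the columns of Table F-2 can be checked with a few lines of Python: the hold-out SSE (last 8) is the difference between SSE (all 23) and SSE (1st 15), and each rank column is an ascending sort of the corresponding SSE column. This is only a sketch with hypothetical names. The AR(1) entry for SSE (all 23) is illegible in the source table, so the value 1727.7 used here is an assumption obtained by adding the legible 1266.9 and 460.8.

```python
# SSE values transcribed from Table F-2 (one decimal place).
sse_first15 = {
    "Log-Linear": 870.7, "Forsythe": 291.1, "Stanford-B": 359.2,
    "AR(1)": 1266.9, "AR(2)": 1340.4, "AR(3)": 1405.5, "AR(4)": 1302.1,
    "ARMA(1,1)": 89.9, "ARMA(1,2)": 3699.0, "ARMA(2,1)": 329.9,
    "ARMA(2,2)": 302.9,
}
sse_all23 = {
    "Log-Linear": 1040.1, "Forsythe": 386.5, "Stanford-B": 364.7,
    "AR(1)": 1727.7,  # assumption: 1266.9 + 460.8, cell illegible in source
    "AR(2)": 1753.9, "AR(3)": 1754.0, "AR(4)": 1503.5,
    "ARMA(1,1)": 398.1, "ARMA(1,2)": 10270.0, "ARMA(2,1)": 572.3,
    "ARMA(2,2)": 524.1,
}

# Hold-out error: squared error accumulated over the last 8 observations only.
sse_last8 = {m: sse_all23[m] - sse_first15[m] for m in sse_first15}


def rank(values):
    """Return {model: rank}, where rank 1 is the smallest SSE."""
    ordered = sorted(values, key=values.get)
    return {model: i + 1 for i, model in enumerate(ordered)}


for label, column in (("1st 15", sse_first15),
                      ("all 23", sse_all23),
                      ("last 8", sse_last8)):
    print(label, rank(column))
```

Differences computed from the rounded columns can be off by 0.1 from the printed SSE (last 8) values, but the resulting ranks are unaffected.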

Parameters: a* = 277.2899795, b* = -0.907621213
SSE (1st 15): 8.707E+02
SSE (all 23): 1.040E+03
Equation: Y(x) = a*x^b

Figure F-1 EXCEL, Log-Linear Model Fit/Forecast of Notional C-17 Data

[Chart: Log-Linear/C-17 Forecast (from 15 unit history)]


Parameters: a* = 1698.476005, b* = -2.500637253, cmin* = 26.70881
SSE (1st 15): 2.911E+02
SSE (all 23): 3.865E+02
Equation: Y(x) = a*x^b + cmin

Figure F-2 EXCEL, Forsythe Fit/Forecast of Notional C-17 Data

[Chart: Forsythe/C-17 Forecast (from 15 unit history); horizontal axis: Observation Number]


Parameters: phi_1* = 0.9010377, phi_2* = -0.0341, const* = -0.253
SSE (1st 15): 1.340E+03
SSE (all 23): 1.754E+03
Equation: AR(2) = phi_1*(Ysubx-1) + phi_2*(Ysubx-2) + const + error

Figure F-5 EXCEL, AR(2) Fit/Forecast of Notional C-17 Data

[Chart: AR(2)/C-17 Forecast (from 15 unit history)]


Parameters: phi_1* = 0.888975471, phi_2* = -0.009425773, phi_3* = -0.032953591, const* = 0.481052734
SSE (1st 15): 1.406E+03
SSE (all 23): 1.754E+03
Equation: AR(3) = phi_1*(Ysubx-1) + phi_2*(Ysubx-2) + phi_3*(Ysubx-3) + const + error

Figure F-6 EXCEL, AR(3) Fit/Forecast of Notional C-17 Data

[Chart: AR(3)/C-17 Forecast (from 15 unit history)]


Parameters: phi_1* = 0.852839607, phi_2* = -0.011344546, phi_3* = 0.025242066, phi_4* = -0.081943217, const* = 2.673475955
SSE (1st 15): 1.302E+03
SSE (all 23): 1.503E+03
Equation: AR(4) = phi_1*(Ysubx-1) + phi_2*(Ysubx-2) + phi_3*(Ysubx-3) + phi_4*(Ysubx-4) + const + error

Figure F-7 EXCEL, AR(4) Fit/Forecast of Notional C-17 Data


Parameters: muprime* = 16.9352 [phi* and theta* illegible]
SSE (1st 15): 8.991E+01
SSE (all 23): 3.981E+02
Equation: ARMA(1,1) = phi*(ARMA(1,1)-1) + muprime - theta*errorlast + noise

Figure F-8 EXCEL, ARMA(1,1) Fit/Forecast of Notional C-17 Data

[Chart: ARMA(1,1)/C-17 Forecast (from 15 unit history)]


Parameters: muprime* = 15.9419, phi1* = 0.8994, theta1* = -0.4915, theta2* = 0.13380841
SSE (1st 15): 3.699E+03
SSE (all 23): 1.027E+04
Equation: ARMA(1,2) = phi1*(ARMA(1,2)-1) + muprime - theta1*errorlast - theta2*errorlastlast + noise

Figure F-9 EXCEL, ARMA(1,2) Fit/Forecast of Notional C-17 Data

[Chart: ARMA(1,2)/C-17 Forecast (from 15 unit history); horizontal axis: Observation Number]


Parameters: muprime* = 21.6784, phi1* = 0.4097, phi2* = -0.1062 [trailing digits illegible], theta* [illegible]
SSE (1st 15): 3.299E+02
SSE (all 23): 5.723E+02
Equation: ARMA(2,1) = phi1*(ARMA(2,1)-1) + phi2*(ARMA(2,1)-2) + muprime - theta*errorlast + noise

Figure F-10 EXCEL, ARMA(2,1) Fit/Forecast of Notional C-17 Data

[Chart: ARMA(2,1)/C-17 Forecast (from 15 unit history); horizontal axis: Observation Number]


Parameters: muprime* = 20.4647, phi1* = 0.4147, phi2* = -0.0766, theta1* = -0.201639264, theta2* = -0.137342779
SSE (1st 15): 3.029E+02
SSE (all 23): 5.241E+02
Equation: ARMA(2,2) = phi1*(ARMA(2,2)-1) + phi2*(ARMA(2,2)-2) + muprime - theta1*errorlast - theta2*errorlastlast + noise

Figure F-11 EXCEL, ARMA(2,2) Fit/Forecast of Notional C-17 Data
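
The AR and ARMA equations in this appendix all describe the same one-step recursion: each prediction combines the previous observations (weighted by the phi terms), a constant, and the previous prediction errors (weighted by the theta terms). A minimal Python sketch of that recursion follows. The function name, the zero start-up errors, and the toy series are assumptions made for illustration; the source does not say how the EXCEL worksheets initialized the error terms or whether the data were transformed first.

```python
def arma_one_step(series, phis, thetas, muprime):
    """One-step-ahead ARMA(p, q) predictions.

    pred[t] = sum_i phis[i] * series[t-1-i] + muprime
              - sum_j thetas[j] * errors[t-1-j]

    Predictions start at t = p. Errors before that point are
    taken to be zero (an assumption of this sketch).
    """
    p, q = len(phis), len(thetas)
    preds = [None] * len(series)
    errors = [0.0] * len(series)
    for t in range(p, len(series)):
        ar_part = sum(phis[i] * series[t - 1 - i] for i in range(p))
        ma_part = sum(thetas[j] * errors[t - 1 - j]
                      for j in range(q) if t - 1 - j >= 0)
        preds[t] = ar_part + muprime - ma_part
        errors[t] = series[t] - preds[t]
    return preds, errors


# Toy ARMA(1,1) example with made-up numbers (not thesis data).
toy_series = [2.0, 3.0, 2.0]
toy_preds, toy_errors = arma_one_step(toy_series, phis=[0.5],
                                      thetas=[0.5], muprime=1.0)
toy_sse = sum(e * e for e in toy_errors)
print(toy_preds, toy_errors, toy_sse)
```

Setting `thetas=[]` reduces the same function to a pure AR(p) model, with `muprime` playing the role of the `const` term in the AR equations.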


Table F-3 The Notional C-17 Data Base

Unit Number  Adj. Hours
3.5          100
5            63.04729214
6            40.06102212
7            39.63005339
8            33.24942792
9            35.25934401
10           32.31121281
11           25.27841342
12           31.03928299
13           34.94279176
14           39.33638444
15           27.60869565
16           23.78718535
17           28.80244088
18           23.69946606
19           23.85964912
20           23.07398932
21           24.00839054
22           21.73150267
23           23.67276888
Epilogue


The  Data  —  A  Humorous  Aside 

As an entertaining aside which may actually not survive the editorial shears of my
advisor, I’ve got to say that searching for a data base on which to test the aforedeveloped
models turned out to be quite the formidable and frustrating task! In the final frantic
throes of my thesis process (and under penalty of not graduating....), I searched high and
low for this data. My advisor searched high and low too and even went so far as to call in
some personal debts -- all to no avail (???).

My search led me to an old friend in the Defensive Avionics System Program
Office (SPO) here on base. He pointed me towards what he termed the ‘manufacturing
pukes’ (MPs) in the aircraft SPOs. One after another, the MPs suggested, “Sorry, we
can’t help you. You’ll need to go directly to the manufacturer to get that kind of data!
Be forewarned, however, they might consider it proprietary and opt not to give it to you.”

So,  that  suggestion  fresh  in  my  mind,  I  went  to  the  Aeronautical  Systems  Center 
(ASC)  Technical  Library  to  see  what  they  had  before  attempting  an  assault  on  the 
manufacturers.  When  the  people  at  the  Tech  Library  heard  what  I  wanted,  they  did  the 
proverbial,  “You  want  it  WHEN????!!!”  Then  they  told  me,  “Sorry,  we  can’t  help  you. 
You’ll  need  to  go  directly  to  the  manufacturers  to  get  that  kind  of  data.  Be  forewarned, 
however,  they  might  consider  it  proprietary  and  opt  not  to  give  it  to  you.”  (Sound 
familiar?) 

Okay, manufacturers, here I come! Upon questioning, the one fellow I did
manage to get in contact with at an un-named aircraft manufacturing company (my
middle name is chicken or I’d name the company as well as the individual) said, “I can’t
help you; that data is proprietary. You’d do best to talk to the people at the SPOs.”

Geee  Wizzz!  I  think  I’m  on  my  own. 

This tale of woe does have a good as well as an entertaining ending. In my final
search for a lead and one last attempt to secure my graduation, I spoke to the Operations
Research Department Head, Lieutenant Colonel Paul Auclair. He promptly pointed me
towards the Logistic School (another school at AFIT) to a man named Dr. Vaughn (AFIT
Department of Research), who promptly pointed me towards Professor Roland Kankey
(AFIT Department of Acquisition) who promptly pointed me (and quite accurately so!) to
the ASC Cost Library where I found the data for which I’d been so fervently searching.

I  left  the  library  with  a  plethora  of  publications  with  dates  ranging  from  1949  to 
1982.  The  study  dated  1982  and  entitled  The  Learning  Curve  in  the  Airframe  Industry 
(Brewer,  1982)  is  the  source  of  my  actual  aircraft  build  history  data  —  the  1000  unit 
history  of  the  F-102.  Before  I  discovered  my  wonderful  data  base,  and  while  leafing 
through  the  stack  of  documents,  I  found  an  interesting  comment  in  the  work  entitled  An 
Airframe  Production  Function  (Alchian,  1949): 

“Within  individual  airframe  manufacturing  facilities,  estimates  of  costs  of  producing 
airplanes  are  frequently  based  on  the  learning  curve.  Unfortunately,  no  data  on  the 
estimates  made  by  aircraft  manufacturing  facilities  themselves  have  been  available,  so 
tests  of  their  reliability  could  not  be  made.  Naturally  enough,  individual  business  units 
are  not  willing  to  publicly  reveal  their  cost  estimating  records.” 

I  guess  maybe  he  didn’t  graduate  on  time. 